PREFACE


Welcome to Xerox Imaging Systems’ TextBridge™, 
OCR for Windows™. TextBridge incorporates power- 
ful optical character recognition technology and 
an easy-to-use interface so you can quickly convert 
hard copy and on-line images into formatted text files. 


Files produced by TextBridge are compatible with a 
variety of word processing, desktop publishing, data 
base, and spreadsheet applications. 


Before going on to find out more about TextBridge, 
please read this preface, as it describes these 
important items: 


• About this manual
• Documentation conventions
• Related publications
• Customer support


ABOUT THIS MANUAL 


Note 


This user’s guide includes introductory and procedural 
information designed primarily for non-technical 
users. However, you should be familiar with the 
management and operation of your personal computer 
(PC) and Microsoft Windows, version 3.1. 


This manual should provide all the information you 
need to operate TextBridge. However, Xerox Imaging 
Systems invites your comments about the information 
provided here. Please fill out the accompanying 
registration card and mail it as directed to Xerox 
Imaging Systems, Inc. 
Organization of this manual




This manual is designed mainly as a reference tool 
with many practical tips. It is organized as follows: 


Chapter 1, “Overview,” discusses TextBridge 
features and benefits, the TextBridge Help 
system, and how to proceed with installing and 
using TextBridge. 


Chapter 2, “Installation,” provides complete 
information about installing TextBridge and 
configuring your PC for using the application. 


Chapter 3, “Using TextBridge,” provides step-by- 
step procedures to process hard copy and on-line 
document images to usable text files on your PC. 


Chapter 4, “Tips and Techniques,” provides 
practical suggestions for getting the best 
performance from TextBridge. It also describes 
the use of TextBridge OCR from within other 
applications. 


Appendix A, “Troubleshooting and Error
Correction,” describes common problems that you
can encounter during TextBridge installation, system
configuration, and operation. It recommends
a solution for each of the problems. It also lists the
error messages that can be generated during
TextBridge operation and suggests ways to correct
the errors.


The “Glossary of Terms” defines words, phrases, 
and concepts used in this manual. 


This manual also provides a comprehensive index for 
quickly locating the information you need. 
Documentation conventions


As described in Table P-1, TextBridge documentation 
uses certain graphical elements and formatting to 
emphasize information and denote meaning in text. 


Table P-1. Documentation Conventions


Convention      Description

bold            Introduces a new term, or the first use of an
                important term in a chapter; also sometimes
                used to denote strong in-line emphasis.

italic          Denotes titles of other manuals or books. Also
                used to denote generic representations of file
                name entries in examples, for example,
                filename.tif

monospace       Denotes examples, menu text, actual file names
                or messages that appear on the computer screen.

“ ” (quotes)    Denotes titles of chapters and sections in this
                manual. Also used for values that you may type
                into a menu. For example:
                Enter a number from “1” to “10” ...

(tip icon)      Introduces tips that provide useful information
                about a procedural step or system function.

Note            Introduces information of note about the
                current subject.




RELATED PUBLICATIONS 


Refer to the TextBridge Quick Reference card for 
capsule summaries of TextBridge operation. 


Refer also to the on-line Release Notes included with 
the product. 


Tip: The Release Notes document, in Microsoft Write
format, is included on the TextBridge installation
disks. After you install TextBridge, the Release
Notes document automatically appears in the
TextBridge OCR program group under the
Windows Program Manager:


[Icon: Release Notes]


Double-click the icon to view up-to-date informa- 
tion that is not in the standard documentation set. 
Please read the Release Notes carefully before 
proceeding with TextBridge operation. 


TextBridge provides drivers for a number of popular 
desktop scanners. Refer to the scanner manufacturer’s 
documentation for complete information on the 
scanner. 


Finally, refer to Microsoft’s Windows 3.1 User’s Guide 
for information about operating and configuring 
Windows on your PC. 
CUSTOMER SUPPORT


If you should experience problems with TextBridge 
that you cannot resolve, do the following: 


First, consult Appendix A of this manual for trouble- 
shooting information, and a list and description of 
errors and suggestions to correct them. 


If you purchased TextBridge from a Xerox Imaging 
Systems’ authorized reseller, and you cannot resolve 
the problem, call the reseller for assistance. 


If you should need to call, be ready to provide: 


• your software registration number (the serial
  number on Disk 1 of the original TextBridge
  installation diskettes)

• a description of the steps that led up to the
  problem

• if TextBridge generated an error message, a
  verbatim description of the error message (and/or
  number)
1 OVERVIEW


Welcome to Xerox Imaging Systems’ TextBridge™ 
OCR for Windows. OCR stands for optical char- 
acter recognition, the capability to read an image 
and recognize and output text from it. Images can 
come from scanners, fax modems, or other sources. 


TextBridge is a suite of applications that combines 
Xerox Imaging Systems’ industry-leading OCR with 
simple Microsoft Windows interfaces. 


Figure 1—1 shows the TextBridge Main dialog, which 
you can access as a stand-alone program or embedded 
in other text applications. 


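[Figure: the TextBridge Main dialog, showing the Input
From options (including File), the Preview, Verify, and
Save Page Images options, the Preferences... and About...
buttons, and a Status area reading “Scanner detected:
ISIS” and “English language loaded.”]


Figure 1-1. Main dialog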


To work with imaging and fax applications, Text- 
Bridge provides an OCR Printer, which is selected 
through the application’s Print Setup command. Then, 
using the Print command of the application, you can 
“print” an image to the OCR Printer, and save 
recognized text to a file in the text format of choice 
(for example, your word processor format). 
TextBridge applications can convert recognized text to
a number of word processing, spreadsheet, database, 
and other text formats. 


TextBridge runs on Windows-compatible personal 
computers with an 80386 (or more powerful) 
microprocessor and at least four megabytes of system 
memory (eight megabytes is recommended). 


The program runs on DOS Version 5.0 (or later) and 
Microsoft Windows, Version 3.1 (or later) in 
enhanced mode only. 


TextBridge will also run under IBM’s OS/2® 
operating system (version 2.0). 


WHAT IS TEXTBRIDGE OCR? 


TextBridge OCR is software that turns image data 
into usable text files on your PC. 


With TextBridge OCR, you can access the valuable 
data locked inside paper documents, as well as on-line 
faxes and page images from other sources. 


TextBridge recognizes and converts scanned and on- 
line page images to text files that you can open, edit, 
reformat, republish, and otherwise apply (Figure 1-2). 


TextBridge supports a number of popular desktop 
scanners and converts recognized text to many 
popular text formats. 


Use your scanner to input hard copy documents to 
TextBridge, which takes the scanned images, 
performs OCR, converts the recognized text to the 
format of your choice, and stores it as a PC file. 


Alternatively, use TextBridge to recognize and convert 
on-line page images stored in TIFF (Tag Image File 
Format). These images can come from fax modems or 
other sources. 
[Figure: an on-line image file (TIFF) is sent to the
program; TextBridge performs OCR and text conversion; an
on-line file in the specified text format (e.g.,
WordPerfect) is written to disk.]


Figure 1-2. TextBridge OCR process


TextBridge provides an easy-to-use interface and a 
powerful set of built-in capabilities. For example, the 
preview tool lets you view and zone pages before 
OCR. The verifier lets you interact with the OCR 
software to achieve the highest possible accuracy. 


TextBridge incorporates Xerox Imaging Systems’ 
industry-leading document recognition software, and 
includes a number of technologies developed by Xerox 
Palo Alto Research Center (PARC), where modern 
computer interfaces originated. 
Consequently, TextBridge provides the most accurate
OCR and format retention results on the widest range 
of documents: 


• documents with point sizes ranging from 6-point to
  72-point type in practically any typeface

  Tip: TextBridge can reliably recognize type smaller
  than 8-point only from images scanned at 400
  dots per inch resolution.

• documents printed on typewriters, phototypesetters,
  and impact, ink-jet, and laser printers

• photocopied, degraded, or dirty documents

• documents with single- or multiple-column
  layouts

• documents containing halftones

• on-line single- or multiple-page TIFF images from
  fax modems and other sources

• hard-copy faxes

• documents composed in English, French, Italian,
  German, or Spanish

  Tip: TextBridge versions shipped in international
  markets can recognize an even greater
  number of languages.


SUPPORTED TEXT FORMATS 




TextBridge can convert recognized text to a number of 
output formats. With some formats, TextBridge sup- 
ports multiple versions, as shown in Table 1-1. 


Note that this list is subject to change. Refer to the 
on-line TextBridge Release Notes for the latest 
information. 
Table 1-1. Supported Text Formats


Application                 File Extension

Ami Pro 3.0                 .sam
Ami Pro 2.0                 .sam
ASCII Standard              .txt
ASCII Smart                 .txt
ASCII Stripped              .txt
dBase IV
DCA/RFT
DisplayWrite 5
FrameMaker
Interleaf
Lotus 1-2-3
Excel for the Macintosh
Excel 3.0
Excel 4.0
RTF (Rich Text Format)
RTF for the Macintosh
Multimate Advantage
PostScript
Professional Write 2.0
Professional Write 2.2
Samna Word IV+
Windows Write
Word for Windows 2.0
WordPerfect 4.2
WordPerfect 5.1
WordStar


SUPPORTED SCANNERS 


Using built-in ISIS (Image and Scanner Interface 
Standard) drivers from Pixel Translations Inc., Text- 
Bridge works with most scanners in the PC market: 


Apple OneScanner 

Canon IX-12 

Complete PC scanner 

Ficus LEOscan 610 

Ficus LEOscan 1210 

Envision 6000 

Envision 6100 

Envision 8100 

Epson ES300C (GT-4000 outside U.S.) 
Epson ES600C (GT-6500 outside U.S.) 
Epson ES800C (GT-8000 outside U.S.) 
Fujitsu ScanPartner 10 

Fujitsu M3096G 

Fujitsu M3097G 

HP ScanJet 

HP ScanJet Plus 

HP IIc 

HP IIp 

HP IIcx 

Microtek MS-II 

Microtek ScanMaker II 

Microtek 600Z 

Panasonic FX-RS307 

Relisys Aries 1201 

Tamarack 6000c 

UMAX UC-630 scanner with GSII-PC card 
XIS Datacopy GSplus 

XIS Datacopy 730GS 


In addition, TextBridge supports the TWAIN 
standard. 


Thus, TextBridge works with any fully TWAIN- 
compliant scanner or other device that connects to a 
PC and produces binary (black-and-white) images in a 
supported size and resolution. 
Note


For example, a number of hand scanners, such as the 
Logitech™ ScanMan® and Artec WalkScan™, are 
provided with TWAIN source drivers. 


In addition, some of the scanners that TextBridge
supports through ISIS are also provided with TWAIN
source drivers, for example, the Ficus LEOscan™.


The list of scanners that work with TextBridge is 
always growing. With TWAIN support, and the 
growing list of scanner manufacturers who are 
providing TWAIN drivers, TextBridge will work with 
many scanners. Refer to the on-line TextBridge 
Release Notes for the latest information. 


WHAT COMES WITH TEXTBRIDGE 


Note 


TextBridge is provided on several 3.5-inch high-
density diskettes. The diskettes include software
programs and libraries, ISIS drivers, language
packs, and on-line help.


The TextBridge product also includes this user’s guide 
and a quick-reference card, as well as a software 
registration card. 


Be sure to fill out the software registration card, as it 
entitles you to technical support, and assures that you 
are kept up to date on new software releases. 


If any piece is missing from your TextBridge package, 
call your authorized Xerox Imaging Systems’ dealer or 
reseller. If you are unable to solve the problem, you 
can call Xerox Imaging Systems directly. 


For information about contacting Xerox Imaging 
Systems, refer to the “Customer Support” section in 
the Preface of this user’s guide. 
SYSTEM REQUIREMENTS


Note


To install and run TextBridge, your Windows- 
compatible PC must be equipped with the following: 


• an Intel (or compatible) 80386, 80486, or
  Pentium™ microprocessor

• Microsoft Windows™, version 3.1 or later, running
  in enhanced mode only

• Microsoft Disk Operating System (DOS)™,
  version 5.0 or later

• four megabytes (4Mb) of random access memory
  (RAM); eight megabytes (8Mb) is recommended

• 8Mb to 16Mb of permanent virtual memory;
  16Mb is recommended

  Tip: For information about configuring your
  system with permanent virtual memory, refer
  to the Microsoft® Windows™ Version 3.1
  User’s Guide published by Microsoft
  Corporation.

• a hard disk with a minimum of 4Mb of free space
  in which to install TextBridge; the 4Mb minimum
  disk space requirement enables installation of all
  TextBridge application software, one ISIS scanner
  driver, and one language pack. Please allow 700K
  for each additional language pack you intend to
  install.
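
The disk space arithmetic above can be sketched in a few lines of Python. This is only an illustration, not part of TextBridge: the function name is hypothetical, and it assumes that 1Mb equals 1024K and that the first language pack is already covered by the 4Mb minimum.

```python
# Illustrative sketch only; names are hypothetical, figures come from the list above.
BASE_INSTALL_KB = 4 * 1024     # 4Mb minimum (assumes 1Mb = 1024K): software,
                               # one ISIS scanner driver, and one language pack
EXTRA_LANGUAGE_PACK_KB = 700   # allowance for each additional language pack


def required_disk_kb(language_packs: int) -> int:
    """Estimate the free hard disk space (in K) needed to install TextBridge."""
    if language_packs < 1:
        raise ValueError("at least one language pack must be installed")
    return BASE_INSTALL_KB + (language_packs - 1) * EXTRA_LANGUAGE_PACK_KB


if __name__ == "__main__":
    for packs in (1, 3, 5):
        print(f"{packs} language pack(s): about {required_disk_kb(packs)}K free")
```

For example, installing all five languages listed in Chapter 1 would call for roughly 4Mb plus four additional 700K allowances.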


TextBridge runs under IBM’s OS/2 operating system,
Version 2.0 and later, which operates many Windows
3.1 programs. On OS/2 systems, TextBridge requires
16Mb of RAM.




ON-LINE HELP 


TextBridge is designed to be easy to learn and use. 
However, if you need assistance, TextBridge provides 
a complete hypertext-based on-line Help system. 


In any of TextBridge’s screens, you can select Help 
and display a Help window about that particular 
screen (Figure 1-3). 


[Figure: a Windows Help window (File, Edit, Bookmark, and
Help menus) open to the Main Dialog topic, with a callout
marking the buttons used to move through Help text. The
visible Help text explains that the TextBridge main dialog
appears when you first launch the application, and that
TextBridge recognizes text from binary (black and white)
TIFF files only. The resolution of the TIFF image must be
one of 200-by-100, 200-by-200, 300-by-300, 400-by-200, or
400-by-400 dots per inch (dpi). The maximum allowable size
of the TIFF file is dictated only by the amount of RAM and
virtual memory (swap space) your PC is configured with.
When you select File, then select GO!, TextBridge displays
a standard Windows Open dialog to specify the TIFF file
name and location.]


Figure 1-3. TextBridge on-line Help
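
The Help text shown in Figure 1-3 lists the TIFF images that TextBridge accepts. The check can be sketched in Python as follows. This is a hypothetical helper, not a TextBridge function, and it assumes each resolution pair is given as horizontal-by-vertical dpi.

```python
# Illustrative sketch only; the function and constant names are hypothetical.
# Resolution pairs are assumed to be (horizontal dpi, vertical dpi).
SUPPORTED_TIFF_RESOLUTIONS = {
    (200, 100),
    (200, 200),
    (300, 300),
    (400, 200),
    (400, 400),
}


def is_supported_tiff(x_dpi: int, y_dpi: int, bits_per_pixel: int = 1) -> bool:
    """Return True for a binary (black-and-white) image at a listed resolution."""
    is_binary = bits_per_pixel == 1
    return is_binary and (x_dpi, y_dpi) in SUPPORTED_TIFF_RESOLUTIONS


if __name__ == "__main__":
    print(is_supported_tiff(300, 300))      # binary image at a listed resolution
    print(is_supported_tiff(300, 300, 8))   # grayscale image, not binary
    print(is_supported_tiff(150, 150))      # resolution not in the list
```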


By selecting the Contents button near the top of the 
window, you can display the top-level index for 
TextBridge Help. You can move through the 
TextBridge Help system in a number of ways: 


• by selecting a topic from the Contents list
• by choosing a jump
• by searching for a topic
• by browsing forward or backward
• by backtracking


In any Help window, you can pull down the Help
menu and select the “How to Use Help” topic for
complete information about using Help tools.
GETTING STARTED




Before you can use TextBridge, you must install it on 
your PC as directed in Chapter 2. 


Tip: If you run into any problems installing
TextBridge, refer to Appendix A of this user’s
guide, which provides troubleshooting tips.


After you successfully install TextBridge, you can 
refer to the Quick Reference Card for a quick-start 
procedure. 


Alternatively, for more detailed step-by-step operating 
procedures, refer to Chapter 3 of this guide. 


After you become familiar with the basic operation of 
TextBridge, refer to Chapter 4, “Tips and Techniques.” 
This chapter provides advice for getting the most out 
of TextBridge. 
Note


2 INSTALLATION


This chapter describes the TextBridge installation 
process, which requires as many as three stages: 


1. Follow the scanner manufacturer’s instructions to 
install and test the scanner and DOS system-level 
driver, or TWAIN source driver. 


   Tip: The scanner’s DOS system-level driver or
   TWAIN source driver should be provided by
   the scanner manufacturer.


2. Install the TextBridge software, including 
language packs for the languages you will be 
recognizing. 


3. Run the TextBridge scanner setup program and 
test the scanner interface. 


If you own an XIS Datacopy GSplus or 730GS scanner,
the correct system-level drivers are provided with
TextBridge. For information, refer to “Installing and
Testing the Scanner” later in this chapter.


This chapter organizes information in the installation 
sequence listed above. 


However, before describing the installation steps in 
more detail, this chapter discusses issues relating to 
“System Configuration and Performance.” 


Please read that section, as it describes ways to avoid 
problems associated with memory limitations and 
inefficient use of your PC’s resources with TextBridge. 


Also, at the end of the chapter is a de-installation 
procedure in case you want to restore your PC to its 
original state. 
SYSTEM CONFIGURATION AND PERFORMANCE


Important 


TextBridge operates under Windows 3.1 (or later) in 
enhanced mode only. Thus, your PC must have at 
least an Intel 80386 microprocessor. 


Also, to run TextBridge, your PC must have at least 
four megabytes (4Mb) of memory (RAM), and must 
be configured with at least eight megabytes (8Mb) of 
permanent virtual memory (swap space). 


If you regularly intend to scan multiple-column or 
landscape pages of text, or pages with complex 
layouts, you should configure your PC with eight 
megabytes (8Mb) or more of system memory (RAM) 
and 16Mb of virtual memory. 


With only 4Mb of memory, the minimum requirement, 
TextBridge will more often use virtual memory 
(swap space on your hard disk). 


If you try to run TextBridge with less than 4Mb of 
RAM, the application informs you that it: 


Cannot initialize server. 


In general, the more RAM that is available when you 
use TextBridge, the less swapping to disk will be 
required during operation. 


As a rule of thumb, you should configure your system 
with twice as much virtual memory as RAM. Thus, if 
you have 4Mb of memory, configure your system with 
at least 8Mb of virtual memory. If you have 8Mb of 
RAM, specify 16Mb of virtual memory. 
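
The rule of thumb above can be expressed as a short Python sketch. The function name is hypothetical; the 4Mb and 8Mb minimums are the figures stated earlier in this section, and the result is a rough guide rather than a setting that TextBridge itself computes.

```python
# Illustrative sketch only; names are hypothetical, figures come from this section.
MIN_RAM_MB = 4       # minimum system memory required to run TextBridge
MIN_VIRTUAL_MB = 8   # minimum permanent virtual memory (swap space)


def recommended_virtual_memory_mb(ram_mb: int) -> int:
    """Apply the rule of thumb: twice as much virtual memory as RAM."""
    if ram_mb < MIN_RAM_MB:
        raise ValueError(f"TextBridge needs at least {MIN_RAM_MB}Mb of RAM")
    return max(MIN_VIRTUAL_MB, 2 * ram_mb)


if __name__ == "__main__":
    for ram in (4, 8):
        print(f"{ram}Mb RAM: {recommended_virtual_memory_mb(ram)}Mb virtual memory")
```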


It is necessary to configure virtual memory as permanent
virtual memory, especially for PCs with only
4Mb of RAM. This assures that TextBridge always
has an adequate amount of contiguous swap space
during OCR. Refer to Appendix A of this manual, or to
your Microsoft® Windows™ User’s Guide, for
information about configuring virtual memory.
Regardless of your RAM and virtual memory, a
number of other system configuration choices can 
affect the availability of memory to TextBridge, and 
thus can affect performance. 


Following is a list of items that can affect TextBridge 
system performance: 


• RAM disks

  If you have set up your system to use part of your
  extended memory as temporary file storage, called
  a RAM disk, this subtracts from available
  memory.


• TSR (terminate-and-stay-resident) programs

  Some programs are designed to automatically load
  into memory when you start your system, or to
  stay in memory even after you exit them. These
  programs also affect the memory available to
  TextBridge.


• Expanded Memory Drivers

  These are programs that use extended memory as
  expanded memory (memory used by the operating
  system), for example, a windows driver.


• Other Drivers

  These are programs that provide some type of
  system control, for example, a network driver.


If you find that TextBridge’s performance seems slow, 
check to see if your system is configured with any of 
these devices. If there are one or more of these devices 
that you can do without, remove them to improve 
TextBridge performance. 
INSTALLING AND TESTING THE SCANNER


Note 


Using ISIS drivers provided by Pixel Translations 
Incorporated, TextBridge works with many popular 
desktop scanners. 


In addition, with its support of the TWAIN standard, 
TextBridge works with virtually any fully TWAIN- 
compliant device that provides a binary image in a 
supported size and resolution. Supported TWAIN 
devices include a number of hand-held scanners. 


The full list of scanners supported by TextBridge is 
always growing. Check the on-line TextBridge Release 
Notes to find the latest list of supported scanners. If 
your scanner is not in this list, call your authorized 
Xerox Imaging Systems’ reseller, or call XIS Customer 
Support directly. 


Scanners generally require a system-level driver or a 
TWAIN source driver, which is provided by the 
scanner or interface card manufacturer. Consult the 
scanner documentation for details about instal- 
ling your scanner, interface card, and driver. 


XIS Datacopy GSplus or 730GS scanner owners 
should refer to the section, “If you have an XIS 
Datacopy scanner,” for information about the correct 
system-level drivers to use with TextBridge. 


Basic scanner installation steps 




The basic steps for installing a scanner are to: 


1. Install the correct scanner interface card (if one is 
necessary) in the PC bus. 


2. Hook up the scanner to the interface card (or with 
some hand-held scanners, to the serial port) with 
the correct cable, and power up the scanner and 
the PC. 
3. Install the system-level scanner driver (.SYS file)
or TWAIN source driver on your PC hard disk, as
directed by the scanner documentation.


4. Test the scanner using software tools provided by 
the manufacturer. After the scanner is function- 
ing, go on to install TextBridge software. 


   Tip: If your scanner runs independently of
   TextBridge, you can be sure that it is functioning
   correctly. Setting it up to run with TextBridge
   should then be a simple matter.


If you have an XIS Datacopy scanner 


TextBridge works with two scanners from XIS Data- 
copy: the GSplus (using the Rancho Technology 1201 
card) and the 730GS (using the Datacopy 111 card). 


If you purchased one of these scanners to run with 
XIS DISCOVER software, that product provided a 
different system-level driver than the one needed to 
work with TextBridge. The correct system-level 
drivers for the XIS Datacopy scanners are included on 
the TextBridge installation disks. 


Install TextBridge according to the instructions in
“Installing TextBridge Software,” later in this
chapter. Then reference the appropriate system-level
driver in your CONFIG.SYS file. The drivers are stored
in a subdirectory, named DATACOPY, beneath the
TextBridge installation directory, the default of which
is C:\TXBRIDGE.


For the GSplus, the system-level driver is named 
XIS380GS.SYS. To use the GSplus scanner with 
TextBridge, you must use this system-level driver. 


The DEVICE statement in the CONFIG.SYS file for the
GSplus driver would be:


DEVICE=C:\TXBRIDGE\DATACOPY\XIS380GS.SYS
Note


For the 730GS, the system-level driver is named
XISDCP30.SYS. You must use the version supplied
with TextBridge. Thus, the DEVICE statement in the
CONFIG.SYS file would be:


DEVICE=C:\TXBRIDGE\DATACOPY\XISDCP30.SYS BASE=2E8


The XIS Datacopy 111 Card is the appropriate 
interface card for the 730GS. Its default port I/O 
address is 2E8. 


If the 2E8 address conflicts with other cards you have
installed in your PC, you can change the address by
changing the DIP switches.


The following alternative address settings for the 
Datacopy 111 card will work: 


218, 228, 238, 248, 258, 268, 278, 
288, 298, 2A8, 2B8, 2C8, 2D8 


If you change the default address, make sure to
change the DEVICE statement in your CONFIG.SYS
file appropriately. For example, if you set the DIP
switches on the card to 2D8, the DEVICE statement in
the CONFIG.SYS file should be:


DEVICE=C:\TXBRIDGE\DATACOPY\XISDCP30.SYS BASE=2D8
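

To show how the DEVICE statement follows from the DIP switch setting, the sketch below builds the line for the 730GS driver and rejects addresses that are not listed above. It is a hypothetical Python helper for illustration only, not a TextBridge tool, and it assumes the default C:\TXBRIDGE installation directory unless you pass another one.

```python
# Illustrative sketch only; the helper name is hypothetical.
# Addresses are the default and the alternatives listed for the Datacopy 111 card.
DEFAULT_BASE = "2E8"
ALTERNATIVE_BASES = (
    "218", "228", "238", "248", "258", "268", "278",
    "288", "298", "2A8", "2B8", "2C8", "2D8",
)


def device_statement(base: str = DEFAULT_BASE,
                     install_dir: str = r"C:\TXBRIDGE") -> str:
    """Build the CONFIG.SYS DEVICE line for the 730GS system-level driver."""
    base = base.upper()
    if base != DEFAULT_BASE and base not in ALTERNATIVE_BASES:
        raise ValueError(f"{base} is not a listed Datacopy 111 address")
    return rf"DEVICE={install_dir}\DATACOPY\XISDCP30.SYS BASE={base}"


if __name__ == "__main__":
    print(device_statement())        # default address
    print(device_statement("2D8"))   # card switched to 2D8
```

Whatever address you choose, the value after BASE= must match the DIP switch setting on the card.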


INSTALLING TEXTBRIDGE SOFTWARE 
Note


Installing TextBridge software is a two-phase process. 
First, you install the software files. Second, you run a 
scanner setup program to link TextBridge to your 
scanner’s system-level or TWAIN source driver. 


If you are not using TextBridge with a supported 
scanner, you can ignore the second installation phase, 
running the scanner setup program. For example, you 
can use TextBridge to recognize TIFF files produced 
by fax modems. 
Run the software setup program


Note 




To install TextBridge software, use the following 
procedure. 


This procedure assumes that, if you are using a 
supported scanner, it is already connected to your PC 
and is operational. 


1. Insert TextBridge Disk 1 into drive A: or B: 
as appropriate. 


2. From the Windows Program Manager, run 
the TextBridge SETUP command. 


• Pull down the File menu and select the Run
  command.


• In the Run dialog, enter the following:

  b:setup


  This assumes that B: is the drive you are
  using. Use A: instead, if appropriate.


• Press Enter. An initialization message
  appears, followed by the TextBridge main
  setup dialog (Figure 2-1).


[Figure: the TextBridge 2.0 Installation dialog, which
reads “Welcome to the TextBridge 2.0 OCR for Windows
setup procedure. This installation will install the
TextBridge 2.0 OCR software onto your hard disk.” and
provides a Continue button.]


Figure 2-1. TextBridge main setup dialog
3. Click Continue to begin the installation of
TextBridge on your hard disk. 


A dialog now displays the TextBridge installation 
directory on your hard disk. The default is: 


C:\TXBRIDGE 


4. Click Continue to proceed.


A dialog appears enabling you to choose the OCR 
language packs you want to install. 


Tip: You must select at least one language pack in
order for TextBridge to perform OCR.


5. Specify the language packs to be installed. 


• Click the checkbox on for each language pack
  to be installed.


  Tip: TextBridge supports OCR of documents
  printed in English, French, Italian, German,
  and Spanish. European versions of
  TextBridge support even more languages.
  Allow approximately 700K of hard disk
  space for each language pack.


• Click Continue. The setup program begins
  installing TextBridge software files, as
  indicated by a percentage meter on the screen.


6. Insert TextBridge installation disks as 
instructed. 


As the setup program decompresses and copies 
files to your hard disk, it periodically requests you 
to insert another of the TextBridge installation 
disks. Insert each disk and click Continue. 


When the setup program has installed all 
necessary files from the installation disks, it 
displays a dialog informing you so. 
7. Click OK in the dialog.


The setup program automatically creates the 
TextBridge OCR program group and opens it on 
your screen (Figure 2-2). 


[Figure: the TextBridge OCR program group window. Icon
labels visible: Scanner Setup, Release Notes, TextBridge
Application Server.]


Figure 2-2. TextBridge OCR program group


8. If you are using a supported scanner, go on 
to run the Scanner Setup program. 


See the next subsection. 


If you intend to process on-line TIFF images only, 
skip scanner setup and begin using TextBridge, as 
described in Chapters 3 and 4 of this manual. 


Run the scanner setup program 


To use TextBridge with a supported scanner, run the 
Scanner Setup program. Scanner Setup links 
TextBridge to your scanner’s system-level or TWAIN 
source driver. As Figure 2—2 shows, the Scanner 
Setup program icon is available in the TextBridge 
OCR group window. 


This section provides two procedures, one to install 
and test a TWAIN source driver; the other, to install 
and test an ISIS driver. 


Tip: If you have problems getting your scanner to work
with TextBridge, consult Appendix A of this guide.
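

The two procedures in this section tell you to choose the scanner type according to the kind of driver supplied with your scanner. That decision can be sketched in Python as follows. This is one hedged reading of those instructions, with hypothetical names, and it assumes that a scanner supplied with both kinds of driver is set up as ISIS.

```python
# Illustrative sketch only; the function name and parameters are hypothetical.
from typing import Optional


def scanner_type(has_dos_sys_driver: bool, has_twain_source: bool) -> Optional[str]:
    """Pick the Scanner Setup type from the drivers supplied with the scanner."""
    if has_dos_sys_driver:
        # A DOS system-level driver (.sys file) referenced in a DEVICE
        # statement in config.sys calls for the ISIS type.
        return "ISIS"
    if has_twain_source:
        # Only a TWAIN source driver is available.
        return "TWAIN"
    # Neither driver is present: check the scanner documentation.
    return None


if __name__ == "__main__":
    print(scanner_type(has_dos_sys_driver=True, has_twain_source=False))
    print(scanner_type(has_dos_sys_driver=False, has_twain_source=True))
```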
Install and test a TWAIN source driver


To install and test a TWAIN source driver, use the 
following procedure: 


1. In the TextBridge OCR program group, 
double-click the Scanner Setup icon. 


The Scanner Setup main window appears (Figure 
2-3). Note that the default scanner type is ISIS. 


[Figure: the Scanner Setup [ISIS] main window with its
Type menu. Callouts: “Default scanner type is ISIS” and
“You can choose TWAIN instead.”]


Figure 2-3. Scanner Setup main window

2. Specify the scanner type. 


• Pull down the Type menu, and choose TWAIN
  as the scanner type.


  Tip: Select TWAIN only if your scanner is
  provided with a TWAIN source driver.


  If your scanner is provided with a DOS
  system-level driver (.sys file) that must
  be referenced in a DEVICE statement in
  the config.sys file, select ISIS instead,
  and refer to the next subsection, “Install
  and test an ISIS driver.”


  Check your scanner documentation for
  information about the provided driver(s).


• Go on to specify the TWAIN source driver to
  be used by TextBridge.
3. Specify the TWAIN source driver.


• Pull down the File menu, and choose the
  Select Source command. The TWAIN Select
  Source dialog is displayed (Figure 2-4).


[Figure: the Select Source dialog with its Sources list,
in which Logitech ScanMan is highlighted. Callouts:
“Highlight the TWAIN source” and “Click to select source
for TextBridge.”]


Figure 2-4. TWAIN Select Source dialog


• All properly loaded TWAIN source drivers are
  shown. Highlight the one you want TextBridge
  to use.


• Click the Select button to complete the
  selection.


4. Test the scanner interface. 


• Pull down the File menu and select the
  Acquire command. The TWAIN source’s
  native user interface should now appear.
  For example, Figure 2-5 shows a version of
  the Logitech™ ScanMan® native user
  interface.


• Use the TWAIN source’s native UI to specify
  and acquire an image. Refer to the
  documentation that came with the device.
[Figure: the Logitech ScanMan window, with a ruler marked
in inches and Document and Line art settings. Callout:
“TWAIN source native interface lets you set up and use
the device.”]


Figure 2-5. TWAIN source native UI example


5. Exit the Scanner Setup program. 


Pull down the File menu and select Exit. You are 
now ready to use TextBridge with your TWAIN 
device. Refer to Chapters 3 and 4 for information. 
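
The choice between TWAIN and ISIS made in step 2 reduces to a simple rule. The following Python sketch is illustrative only and is not part of TextBridge: the function and parameter names are hypothetical. It assumes that a scanner supplied with a DOS system-level driver uses ISIS even when a TWAIN source is also available, which is how the notes in both subsections read.

```python
def choose_scanner_type(has_twain_source: bool, has_dos_sys_driver: bool) -> str:
    """Return the Type menu choice suggested by the notes in step 2.

    Both parameters are hypothetical flags describing what the scanner
    vendor supplied; check the scanner documentation to determine them.
    """
    if has_dos_sys_driver:
        # A .sys driver referenced by a DEVICE statement in config.sys.
        return "ISIS"
    if has_twain_source:
        # Only a TWAIN source driver is available.
        return "TWAIN"
    raise ValueError("no supported driver; check the scanner documentation")


if __name__ == "__main__":
    print(choose_scanner_type(has_twain_source=True, has_dos_sys_driver=False))
```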


Install and test an ISIS driver 


To install and test an ISIS driver, use the following 
procedure: 


1. In the TextBridge OCR group window, if you 
have not already done so, double-click the 
Scanner Setup icon. 


The Scanner Setup main window is displayed
(refer to Figure 2-3). Note that the default
scanner type should be ISIS.

2. Specify the scanner type.

• Pull down the Type menu, and, if necessary,
choose ISIS as the scanner type.

Note: Select ISIS only if your scanner is provided
with a DOS system-level driver (.sys
file) that must be referenced in a DEVICE
statement in the config.sys file. TextBridge
supplies the higher-level ISIS driver that
should work with your system-level driver.
If your scanner vendor supplies an ISIS
driver, use it instead of the one supplied
with TextBridge.

If your scanner is supplied only with a
TWAIN driver, select TWAIN instead, and
refer to the previous subsection, “Install
and test a TWAIN source driver.”

Check your scanner documentation for
information about the provided driver(s).

• Go on to specify the ISIS driver to be used by
TextBridge.


3. Pull down the File menu and choose the 
Select Source command. 


The ISIS Scanner Selection dialog appears 
(Figure 2-6). 


Click to add your scanner

Figure 2-6. Scanner Selection dialog

4. Display the list of available scanner drivers.

• Click the Add button in the Scanner Selection
dialog. A dialog appears requesting you to:

Insert disk containing scanner drivers

• Insert Disk 1 of the TextBridge installation
disks in the disk drive, and, if necessary,
identify the disk drive in the dialog (A: is the
default).

• Click OK to display the ISIS Add Scanner
dialog (Figure 2-7).


Select your scanner, then click OK. Entries visible in the
Add Scanner list include:

Apple OneScanner
Artiscan and Tamarack SCSI Scanners
Canon IX-12 Flatbed (Adf Optional)
Complete PC Scanner
Datacopy 730, 730GS, and 830, no ADF
Datacopy GS plus
Epson GT-4000..GT-8000 & ES-300C..ES-800(,+)

Figure 2-7. Add Scanner dialog

5. Add your scanner.

• From the Add Scanner dialog, click the listing
for your scanner to highlight it.

• Click OK. The scanner is added, and the Add
Scanner dialog closes. The Scanner Selection
dialog remains displayed (refer to Figure 2-6).

6. If necessary, further define your scanner
configuration for TextBridge.

• In the Scanner Selection dialog, click the
Setup button.

Note: For some scanners, a dialog appears
enabling you to define settings such as
Port Address, SCSI ID Number, Transfer
Mode, Scanning Speed, and so on.

For other scanners, a dialog appears
indicating simply that:

This scanner’s configuration is set
using the system-level driver.

• If applicable, specify appropriate settings for
your scanner configuration. Refer to your
scanner or interface card documentation for
details about scanner configuration settings.

• When you are finished specifying scanner
configuration settings, click OK to save the
new settings and close the scanner dialog.


7. Click OK in the Scanner Selection dialog. 


The Scanner Selection dialog closes, leaving the
Scanner Setup main window displayed.

8. Test the scanner.

• Place a page in the scanner ADF or on the
flatbed.

• In the Scanner Setup main window, pull down
the File menu and select Acquire.

The scanner should activate and scan the page. If
it does not, repeat all the installation steps
described in this chapter. Also, refer to Appendix
A of this guide for troubleshooting information.

If you still cannot get your scanner to work, call
XIS Customer Support.


9. Exit the Scanner Setup program. 


Pull down the File menu and select Exit. You are 
now ready to use TextBridge with your ISIS scan- 
ner. Refer to Chapters 3 and 4 for information. 


DE-INSTALLING TEXTBRIDGE

To restore your PC to the state it was in before you
installed TextBridge, use the following procedure:

1. From the Windows Program Manager, delete
the TextBridge OCR program group.

• Select the TextBridge OCR program group by
clicking on it with the mouse.

• Pull down the Program Manager File menu
and select Delete. A message asks you:

Are you sure you want to delete
the group ‘TextBridge OCR’?

• Click Yes.

2. From the Windows File Manager, delete the
TXBRIDGE folder from your hard disk.

• Select the TXBRIDGE folder by clicking on it
with the mouse.

• Pull down the File Manager File menu and
select Delete. The Delete dialog appears with
the full pathname of the TXBRIDGE directory
highlighted.

• Click OK. A confirmation dialog appears.

• Click the “Yes to All” button. Another
confirmation dialog appears for a system
(hidden) file in the TextBridge directory.

• Click Yes.

3. From the File Manager, delete the TextBridge
initialization files.

• Double-click the Windows directory. This
displays the list of files in the directory.

• Scroll to display the following files:

TXBRIDGE.INI
OCRSRV.INI

• Delete each file by clicking on it with the
mouse, pulling down the File Manager File
menu and clicking Delete. This displays a
Delete dialog box.

• For each file, click OK in the dialog box. The
files are deleted.

4. Optionally, edit the CONFIG.SYS file to
remove the DEVICE statement for your
scanner’s system-level driver.

Note: You will want to remove this only if you are
not running another application that uses the
system-level driver.
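
The edit described in step 4 can also be scripted. The following Python sketch is a minimal illustration, not a TextBridge utility: the function name and the driver file name SCANDRV.SYS are hypothetical, and the sketch assumes that the driver is loaded by a DEVICE or DEVICEHIGH statement that names the .sys file. Back up CONFIG.SYS before changing it.

```python
def remove_device_line(config_text: str, driver_file: str) -> str:
    """Return CONFIG.SYS text without the DEVICE statement for driver_file.

    driver_file is the .sys file name of the scanner's system-level
    driver (hypothetical here); matching ignores case and spaces.
    """
    kept = []
    for line in config_text.splitlines():
        compact = line.upper().replace(" ", "")
        is_device = compact.startswith(("DEVICE=", "DEVICEHIGH="))
        if is_device and driver_file.upper() in compact:
            continue  # drop the statement that loads the scanner driver
        kept.append(line)
    return "\n".join(kept)


if __name__ == "__main__":
    sample = "FILES=30\nDEVICE=C:\\SCAN\\SCANDRV.SYS /P=1\nDEVICE=C:\\DOS\\HIMEM.SYS"
    print(remove_device_line(sample, "SCANDRV.SYS"))
```

All other lines, including DEVICE statements for unrelated drivers, are left untouched.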


With the above steps completed, TextBridge is
completely de-installed from your PC.

3 USING TEXTBRIDGE


After you have installed TextBridge as described in 
Chapter 2, you are ready to use the application to turn 
paper documents and on-line page images into usable 
text files. 


At your direction, TextBridge can scan a document, or 
read on-line TIFF images, and perform optical 
character recognition (OCR). 


TextBridge can also display pages in a Preview 
window, where you can view, zoom, and zone the 
page before processing. 


During OCR, you can verify recognized words as 
correct, or fix recognition errors. By verifying text, you 
teach TextBridge to improve recognition accuracy for 
the rest of the job. 


After TextBridge performs OCR, you can direct the 
application to convert the recognized text to a text 
format that you can use with your favorite word 
processing, desktop publishing, spreadsheet, 
database, or other application. 


This chapter describes and provides step-by-step 
procedures for using the main TextBridge application. 


Note: For information about the TextBridge Application
Server and OCR Printer, see Chapter 4.

Specifically, this chapter covers the following topics:

• About TextBridge preferences
• Scanning and converting a document
• Recognizing and converting on-line TIFF files
• Previewing pages before processing
• Verifying text during recognition

ABOUT TEXTBRIDGE PREFERENCES

TextBridge optical character recognition software has
a built-in set of preferences, settings that control the
OCR process. For most good-quality office documents,
you can use TextBridge default preferences to provide
excellent character recognition.

However, to fine-tune the OCR process for a variety of
documents, TextBridge enables you to define preferences
on a job-by-job basis. Before starting OCR, you
can click the Preferences button in the Main dialog.
This displays the Preferences dialog (Figure 3-1).

Preferences dialog contents:

Document Quality: Standard, Fax
Page Orientation: Portrait, Landscape
Language
Auto Page Segmentation
Ignore Photos/Halftones
Scanner Settings... (click to set scanner settings)

Figure 3-1. Preferences dialog

Specify one or more of the preferences to control how
TextBridge will process your document(s). Table 3-1
describes the options available to you.


Note: When you change a preference, it remains in place
until you change it again, even when you exit TextBridge
and run the program again later. This lets you
lock in place the preferences that are appropriate for
documents you process most often. Of course, you can
always change one or more preferences at any time.

Table 3-1. Preferences

Preference         Function

Document Quality   Choose Standard for most documents.

                   Choose Fax for on-line, fax-quality TIFF images
                   (200x100 or 200x200 dots per inch), or if you are
                   scanning hard copy faxes.

Page Orientation   Click Portrait for most typical portrait-oriented
                   office documents.

                   Click Landscape for wide documents that are
                   scanned in sideways, and thus must be rotated in
                   memory by 90 degrees before recognition begins.

                   Click Auto if your document contains a mixture of
                   page orientations (for example, mostly portrait
                   pages, with large rotated tables on landscape
                   pages). Note that you may also want to click Auto
                   if you are recognizing TIFF files and are not sure
                   how the page image is oriented in the file. With
                   Auto selected, TextBridge performs a pre-processing
                   step to determine the orientation of the
                   page. Thus, overall processing speed is slower
                   with this option turned on.

Language           This category provides a pop-up menu of the
                   TextBridge recognition languages that you have
                   installed on your system. TextBridge can perform
                   highly accurate optical character recognition on
                   documents in German, French, Italian, and Spanish,
                   as well as English. Select the primary language of
                   the document to be recognized (for example, German).

Auto Page          Leave the checkbox off for single-column documents.
Segmentation
                   Click the checkbox on for documents with two or
                   more columns of text. With page segmentation on,
                   TextBridge performs a pre-processing step to
                   analyze the shape and location of text blocks on
                   the page, so they are output in the right order.
                   Note that, in the converted text file, TextBridge
                   outputs text in galley (single-column) format.

Ignore Photos/     Click the Ignore Photos/Halftones checkbox on if
Halftones          the document to be recognized contains photographs
                   or halftones (screened photographs). Otherwise,
                   TextBridge tries to recognize the halftone dots as
                   text, reducing recognition accuracy and speed.

Scanner Settings   Click the Scanner Settings button to display the
                   Scanner Settings dialog, in which you can specify
                   scanner-specific controls.

                   • Use Automatic Document Feeder is available
                   only for scanners that have attached automatic
                   document feeders (ADF). When you click the
                   checkbox on, TextBridge is instructed to scan and
                   recognize pages from the ADF.

                   • Brightness enables you to control the amount of
                   light your scanner shines on a page as it scans it,
                   thus affecting the lightness or darkness of the
                   scanned page image. You can adjust brightness to
                   compensate for different original documents and
                   improve the recognition process.

                   Note: If you are using an HP scanner with HP
                   AccuPage, Auto is the correct brightness
                   setting.

                   • Resolution lets you control the number of dots
                   per inch (dpi) at which the page(s) will be scanned.
                   In general, for best character recognition results,
                   specify the highest resolution, up to 400 dots per
                   inch, that your scanner allows.

                   • Page Size lets you control the size of the area the
                   scanner will scan. In general, specify the smallest
                   size that will accommodate the size of your
                   original pages: Letter (8.5-by-11 inches or 21.59-
                   by-27.94 centimeters); Legal (8.5-by-14 inches or
                   21.59-by-35.56 centimeters); or A4 (8.27-by-11.69
                   inches or 21-by-29.70 centimeters).
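
The Resolution and Page Size guidance in Table 3-1 can be expressed as two small selection rules. The following Python sketch is illustrative only: the function names are hypothetical, the page dimensions are the ones listed in the table, and "smallest" is interpreted here as smallest area, which is an assumption.

```python
CM_PER_INCH = 2.54

# Page sizes from Table 3-1, as (width, height) in inches.
PAGE_SIZES_IN = {
    "Letter": (8.5, 11.0),
    "Legal": (8.5, 14.0),
    "A4": (8.27, 11.69),
}


def to_cm(inches: float) -> float:
    """Convert inches to centimeters, rounded to two decimals."""
    return round(inches * CM_PER_INCH, 2)


def smallest_page_size(width_in: float, height_in: float) -> str:
    """Pick the smallest-area page size that accommodates the original."""
    fitting = [
        (w * h, name)
        for name, (w, h) in PAGE_SIZES_IN.items()
        if width_in <= w and height_in <= h
    ]
    if not fitting:
        raise ValueError("original is larger than every listed page size")
    return min(fitting)[1]


def best_resolution(supported_dpi, cap=400):
    """Pick the highest supported resolution that does not exceed the cap."""
    usable = [dpi for dpi in supported_dpi if dpi <= cap]
    if not usable:
        raise ValueError("scanner supports no resolution at or below the cap")
    return max(usable)


if __name__ == "__main__":
    print(smallest_page_size(8.27, 11.69), best_resolution([75, 150, 300, 600]))
```

The list of supported resolutions passed to best_resolution is a made-up example; the values your scanner offers appear in the Scanner Settings dialog.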
SCANNING AND CONVERTING A DOCUMENT

One of the tasks that TextBridge enables you to
accomplish is scanning a hard copy document into an
on-line text file. The document can comprise one page
or many pages.

During this process, TextBridge scans the first page,
performs OCR on it, and saves the recognized text in a
temporary file. It repeats the process for each
subsequent page, appending the recognized text to the
temporary file.

When you inform TextBridge that the entire
document has been scanned, TextBridge displays the
Save As dialog. In this dialog, you specify the output
file name, disk and directory, and the text format to
save it in.

Note: The following procedure assumes that the scanner is
properly connected to your PC and is powered on and
ready. The procedure also assumes that TextBridge is
installed and running.


To scan and convert a document, complete the 
following steps: 


1. Load the page(s) into your scanner. 


Depending on your scanner, you can load a stack 
of pages in the automatic document feeder (ADF), 
or a single page on the scanner’s flatbed. 


2. From the Main dialog, click Scanner in the 
Input From box (Figure 3-2, next page). 


3. To save an image of each page to a TIFF file, 
click the Save Page Images check box. 


Note: Page images are stored as TIFF files with
CCITT Group 3 compression. See Chapter 4
for more information about Save Page Images.

Specify scanner as the input source

Figure 3-2. Scanner as input source

4. Optionally, click the Preferences button to
further define the scanning and OCR process.

Use Table 3-1 as a guide. With Preferences, you
can specify such things as page orientation,
recognition language, and so on. When you are
done specifying preferences, click OK to return to
the Main dialog.

5. Click GO! in the Main dialog to begin the
scanning and recognition process.

If you are using a TWAIN scanner, the native
user interface of the source driver appears. Here
you can control the scanning process.

If you are using an ISIS scanner, the page(s) in
the scanner are scanned and recognized
automatically.

When you dismiss the native UI (TWAIN), or the
page(s) are all scanned (ISIS), TextBridge displays
the Add More Pages dialog (Figure 3-3).

Add More Pages dialog text: Add more pages to the scanner
and click Continue. Or click End if you are done.

Figure 3-3. Add More Pages dialog

6. To scan more pages, go to step 7. To end
scanning and OCR, go directly to step 8.


7. Prepare your scanner, then click Continue. 


If you are using a TWAIN scanner, the native UI 
reappears, and you can resume scanning and 
recognition. If you are using an ISIS scanner, 
TextBridge automatically resumes scanning and 
recognition. When you are finished, TextBridge 
again displays the Add More Pages dialog (refer to 
Figure 3-3). Proceed from step 6. 


8. Click End in the Add More Pages dialog. 


TextBridge now displays the Save As dialog 
(Figure 3-4). 


Save As dialog contents:

File Name: untitled.sam
Directories: c:\txbridge\bin
List Files of Type: Ami Pro 3.0 (*.SAM)
Drives: c:

Figure 3-4. Save As dialog

9. Specify the output file information.

• Specify the file name, destination disk and
directory.

• Also specify the output text format in the Save
File as Type pop-up menu.

• Click OK.


TextBridge converts the recognized text to the 
specified format, saves the converted text to the 
specified file, and closes the Save As dialog. The 
Main dialog remains, ready for the next job. 
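
The overall flow of this procedure (scan a page, recognize it, append the text to a temporary file, repeat until you click End, then convert and save) can be summarized in code. The following Python sketch is a simplified model, not TextBridge's implementation: the names scan_and_convert, recognize, and convert are hypothetical, and each inner list of pages stands for the pages loaded before one click of Continue.

```python
import os
import tempfile
from typing import Callable, Iterable, List


def scan_and_convert(
    batches: Iterable[List[str]],
    recognize: Callable[[str], str],
    convert: Callable[[str], object],
):
    """Model the scan, OCR, and append cycle, then convert the result.

    batches: groups of page images; exhausting them models clicking End.
    recognize: stand-in for OCR on one page image (hypothetical).
    convert: stand-in for conversion to the chosen output format.
    """
    fd, temp_path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        for batch in batches:  # one batch per visit to Add More Pages
            for page in batch:
                text = recognize(page)
                # Append each page's recognized text to the temporary file.
                with open(temp_path, "a", encoding="utf-8") as tmp:
                    tmp.write(text + "\n")
        with open(temp_path, encoding="utf-8") as tmp:
            recognized = tmp.read()
    finally:
        os.remove(temp_path)
    return convert(recognized)


if __name__ == "__main__":
    result = scan_and_convert(
        [["page one", "page two"], ["page three"]],
        recognize=str.upper,
        convert=lambda text: text.strip().split("\n"),
    )
    print(result)
```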
RECOGNIZING AND CONVERTING TIFF FILES

If you have one or more TIFF files that contain page
images, you can use TextBridge to produce a text file
from them. To TextBridge, an on-line TIFF image file
is similar to one or more scanned page images.

Some TIFF files contain more than one page image.
These multiple-page TIFF files are most commonly
generated from fax modems. TextBridge is equally
capable of processing both single- and multiple-page
TIFF files.

TextBridge can process TIFF (*.TIF) files in the
resolutions and formats described in Table 3-2:

Table 3-2. Supported TIFF Resolutions and Formats

Resolutions*    TIFF Formats

100-by-200      Uncompressed (Intel header)
200-by-100      CCITT-3 (Intel header)
200-by-200      CCITT-4 (Intel header)
300-by-300      Uncompressed (Motorola header)
400-by-200      CCITT-3 (Motorola header)
400-by-400      CCITT-4 (Motorola header)
                Intel FAXability


All TIFF files processed to one document must be 
of the same resolution. If TextBridge has 
processed several files that are 200-by-200 dpi, for 
example, and the next file that it encounters is 
300-by-300 dpi, the program generates an error 
message, then displays the Save As dialog box so 
you can save the recognition results up to that 
point to an output document. 
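
Two of the points above lend themselves to a short illustration: the Intel and Motorola headers listed in Table 3-2, and the rule that all files in one document must share a resolution. In the TIFF format, a file begins with the bytes II (Intel, little-endian) or MM (Motorola, big-endian), followed by the number 42 stored in that byte order. The following Python sketch is illustrative only: the function names are hypothetical, and resolutions are passed in as plain tuples rather than read from the files.

```python
import struct


def tiff_byte_order(header: bytes) -> str:
    """Classify the first four bytes of a TIFF file as Intel or Motorola."""
    if len(header) < 4:
        raise ValueError("too short to be a TIFF header")
    if header[:2] == b"II":
        name, fmt = "Intel", "<H"      # little-endian
    elif header[:2] == b"MM":
        name, fmt = "Motorola", ">H"   # big-endian
    else:
        raise ValueError("not a TIFF header")
    if struct.unpack(fmt, header[2:4])[0] != 42:
        raise ValueError("TIFF magic number 42 not found")
    return name


def accept_until_mismatch(files):
    """Accept files until one differs in resolution from the first.

    files: list of (name, (x_dpi, y_dpi)) in processing order.
    Returns (accepted_names, first_rejected_name_or_None); the accepted
    files model the results that can still be saved.
    """
    accepted = []
    job_resolution = None
    for name, resolution in files:
        if job_resolution is None:
            job_resolution = resolution
        elif resolution != job_resolution:
            return accepted, name
        accepted.append(name)
    return accepted, None


if __name__ == "__main__":
    print(tiff_byte_order(b"II*\x00"))
    print(accept_until_mismatch([("a.tif", (200, 200)), ("b.tif", (300, 300))]))
```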
At your direction, TextBridge opens the TIFF file(s),
reads the image data into computer memory, performs
OCR on the image(s), and then saves the recognized
text to a formatted text file.

Note: The following procedure assumes that TextBridge is
properly installed on your PC and running.


To recognize and convert one or more on-line TIFF 
files to a text file, use the following procedure: 


1. From the Main dialog, click File in the Input 
From box (Figure 3-5). 


Specify TIFF file(s) as the input source

Figure 3-5. Input From File


2. Optionally, specify Preferences to control 
the recognition process. 


• Click the Preferences button to display the
Preferences dialog (refer to Figure 3-1).

• Change one or more of the preferences in
place. Refer to Table 3-1 for information.

• When you are done specifying preferences,
click OK to return to the Main dialog.

3. Click GO! in the Main dialog.

TextBridge displays the Open dialog (Figure 3-6).

Highlight the TIFF file(s) to be processed

Click Add

Selected files are added to the queue

Figure 3-6. Open dialog

4. Specify the TIFF file(s), then click OK.

Note: To select a single file, click on it in the File list
box, then click Add to add it to the Files to
process list. Or, simply double-click the file.

To select a range of files, point to the first file
in the range, click, hold, and drag the mouse
downward in the File list box. Or, click the first
file in a range, then Shift-click (hold down the
Shift key and click the mouse) on the last file
in the range. With the file range selected, click
Add to add the files to the process list.

To select TIFF files that are not in sequence,
click the first file, then Control-click (hold
down the Control key and click the mouse on)
subsequent files in the file list.

Note: To add all files in the current directory to the
process list, click the Add All button.

To add files from another directory, change to
the other directory, and add the files as noted
here. Note that the full pathnames of the files
are added to the process list.

To delete files from the process list, you can
select one at a time, and click the Delete
button. Or, you can select a sequence of files,
or a non-sequential group of files in the same
manner as described above, and click the
Delete button.

Files are processed in the order in which
they are added to the queue. As you add
files, the number of files in the process list is
tracked by a counter (Figure 3-7).


Count of added files is tracked here

Files are processed in the add order

Figure 3-7. Multiple TIFF files to be processed
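
The Files to process list behaves like a simple first-in, first-out queue with a counter. The following Python sketch models that behavior for illustration only: the class name ProcessList, its methods, and the sample pathnames are hypothetical and do not correspond to anything exposed by TextBridge.

```python
class ProcessList:
    """Model of the Files to process list: add order is processing order."""

    def __init__(self):
        self._files = []

    def add(self, *paths: str) -> None:
        """Append full pathnames in the order they are added."""
        self._files.extend(paths)

    def delete(self, *paths: str) -> None:
        """Remove the given pathnames; raises ValueError if one is absent."""
        for path in paths:
            self._files.remove(path)

    @property
    def count(self) -> int:
        """Number of files currently in the list (the on-screen counter)."""
        return len(self._files)

    def in_processing_order(self):
        """Return the files in the order they will be processed."""
        return list(self._files)


if __name__ == "__main__":
    queue = ProcessList()
    queue.add(r"c:\tiff\cover.tif", r"c:\tiff\page0001.tif")
    queue.add(r"c:\tiff\page0002.tif")
    queue.delete(r"c:\tiff\page0001.tif")
    print(queue.count, queue.in_processing_order())
```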


After you add the file(s) and click OK in the Open
dialog, TextBridge performs OCR on the page
image(s). When OCR is complete, TextBridge
displays the Save As dialog (refer to Figure 3-4).

5. In the Save As dialog, specify the output file
information.

• Specify the file name, destination disk and
directory.

• Also specify the output text format in the Save
File as Type pop-up menu.

• When done, click OK.


TextBridge converts the recognized text and saves 
the converted text to a file of the given name, disk 
and directory location. It then displays the Main 
dialog, ready for the next job. 


PREVIEWING PAGES BEFORE PROCESSING 


To view a page before processing, or to define areas of 
a page to process, TextBridge provides the Preview 
window (Figure 3-8). 


Menu bar with preview commands (View, Process, Help)

Toolbox for zoning and zooming

Full page scaled to fit in the window

Figure 3-8. Preview Window

In the Preview window, you can use the Zoom tools in
the Preview toolbox to zoom in and out on the page at
several magnification levels:


Zoom In Zoom Out 


With the Zoom tools, you can magnify a page to full 
resolution, zoom out to fit the page entirely in the 
window, or display the image somewhere in between. 


To define a portion of the page to be processed, you 
can use the Text Zone tool to draw up to 127 rect- 
angular zones around specific areas on the page: 


Text Zone 


Zones are numbered to show the order in which you 
created them and the order in which contained text 
will be output in the finished file. Overlapping zones 
are opaque; the topmost zone “owns” any common text 
it shares with the underlying zone(s). 
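
The zone rules above (output follows zone numbers, and the topmost zone owns any text it shares with zones beneath it) can be modeled in a few lines. The following Python sketch is illustrative only: the Zone class, the output_order function, and the representation of recognized words as (text, x, y) tuples are all hypothetical. Text that lies outside every zone is omitted, on the assumption that only zoned areas are processed.

```python
from dataclasses import dataclass

MAX_ZONES = 127  # limit stated for the Text Zone tool


@dataclass
class Zone:
    number: int  # creation order, which is also the output order
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x: float, y: float) -> bool:
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def output_order(zones_back_to_front, words):
    """Assign each word to the topmost zone containing it.

    zones_back_to_front: zones in stacking order; the last one is topmost.
    words: (text, x, y) tuples in the order they were recognized.
    Returns the texts grouped by ascending zone number.
    """
    if len(zones_back_to_front) > MAX_ZONES:
        raise ValueError("at most 127 zones can be defined")
    by_zone = {zone.number: [] for zone in zones_back_to_front}
    for text, x, y in words:
        for zone in reversed(zones_back_to_front):  # check topmost first
            if zone.contains(x, y):
                by_zone[zone.number].append(text)
                break
    return [text for number in sorted(by_zone) for text in by_zone[number]]


if __name__ == "__main__":
    z1 = Zone(1, 0, 0, 10, 10)
    z2 = Zone(2, 5, 5, 15, 15)
    words = [("shared", 7, 7), ("right", 12, 12), ("left", 2, 2)]
    print(output_order([z1, z2], words))
```

Reversing the stacking order in the example, which models the Move to Front and Move to Back commands, changes which zone owns the word in the overlap and therefore where it appears in the output.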


The Select Zone tool lets you select any of the zones 
you have created: 


Select Zone 


A selected zone is identified by solid corner handles. 
After you select a zone, you can move it, resize it, 
delete it, or put it in front or in back of another zone. 


For processing purposes, you can adjust the zones
page-by-page or have the zones take effect for all
pages of the document. When the job is complete,
TextBridge automatically clears the zones.

Note: The following step-by-step procedure assumes that, if
you are using a scanner, it is properly connected to
your PC, powered on and ready. The procedure also
assumes that TextBridge is properly installed and
running.


To preview a document before processing it, use the 
following procedure: 


1. If you are scanning, load the page(s) into 
your scanner, then go to step 2. If you are 
processing a TIFF file, start at step 2. 


2. In the Main dialog, define the input source. 


• Select either File or Scanner in the Input
From box (refer to Figures 3-2 and 3-5).

• Click the Preview box in the Main dialog.

3. Optionally, click the Preferences button to
further define the scanning and OCR process.

Use Table 3-1 as a guide. When you are done with
preferences, click OK to return to the Main dialog.

4. Click GO! in the Main dialog.


If you are scanning with a TWAIN scanner, the 
native UI for the scanner appears, and you can 
execute scanning from this interface. If you are 
scanning with an ISIS scanner, TextBridge 
automatically scans a page from the scanner. 


If you are reading the page image(s) from one or 
more TIFF files, TextBridge first displays the 
Open dialog (refer to Figure 3-6). In the Open 
dialog, specify the name(s) of the TIFF file(s), and 
their directory and drive location, then click OK. 


TextBridge opens the Preview window, and
displays the first scanned or on-line page image
(refer to Figure 3-8).

5. Optionally, zoom the page image.

The page first appears zoomed out to fit in the
window. To quickly zoom to full resolution, you
can pull down the View menu and choose the
Zoom Max command. To zoom in on the page
image by steps, click the Zoom In icon from the
Preview toolbox, and click on the page display.

When the page is zoomed in, scroll bars appear to
let you shift the display horizontally and vertically
(Figure 3-9).

To zoom all the way out, pull down the View menu
and choose the Zoom Min command. To zoom out
in steps, click the Zoom Out icon, and click on the
page display.

Zoom in to evaluate the image quality

Use scroll bars to shift the display

Figure 3-9. Previewed page zoomed in

6. Optionally, create one or more zones to
define the page area(s) to be processed.


Up to 127 zones can be created, as follows: 


• Click the Text Zone tool, point to a corner of
the area to be zoned, click and hold the left
mouse button, and drag the mouse. A zone
rectangle appears as you are moving the
mouse.

• When the zone is fully sized as you intend,
release the mouse.

Note: The zone number appears in the upper left
corner. Handles appear on the zone rectangle
for resizing purposes (Figure 3-10).

The zone number indicates the order in
which recognized text blocks will be
output in the finished text file.

To resize a zone, click and hold on a
corner handle and drag the mouse.

To move a zone, click and hold inside the
zone, and drag the mouse.

To position a zone relative to another
zone, make sure the zone is selected
(handles are black), then pull down the
View menu and select the Move to Front
or Move to Back command.

Overlapping zones are opaque, meaning
that any image area shared by more than
one zone is output as part of the topmost
zone.

To delete a zone, pull down the View
menu and choose the Clear Zone command
(or simply press the Delete key).

Zone number

Selected zone with handles for resizing

Figure 3-10. Zones in Preview


7. When you are done previewing the page, 
start the OCR process. 


Pull down the Process menu and select This Page 
or All Pages to start/resume OCR. 


Note: Select All Pages to close the Preview window
and process all pages of the job using the zones in
place.

Select This Page to preview and process every
page individually. TextBridge will process the
first page, then scan (or read) and display the next
page in the Preview window. TextBridge then
goes into idle mode. Proceed from step 5 to
preview the now-displayed page.

When you have previewed the last page of the
document, select the All Pages command to
close the Preview window and have
TextBridge finish OCR of the document.

When OCR is completed, TextBridge displays the
Save As dialog (refer to Figure 3-4).

8. Specify the output file information.

• Specify the file name, destination disk and
directory.

• Also specify the output text format in the Save
File as Type pop-up menu.

• When done, click OK.


TextBridge converts the recognized text and saves 
the converted text to a file of the given name, disk 
and directory location. The Main dialog remains, 
ready for the next job. 


VERIFYING TEXT DURING RECOGNITION 


To achieve the highest possible OCR accuracy, even
on difficult documents, TextBridge provides a word
verifier (Figure 3-11).

Figure 3-11. Word Verifier window

The word verifier displays questionable words for
you to correct and/or verify during recognition.

Questionable words are those that fall below a confidence
threshold built into TextBridge. During OCR,
TextBridge assigns a confidence value to each word. If
the value falls below the confidence threshold, and
you are using the Verifier, TextBridge displays the
word as questionable.
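
The confidence test described above amounts to a simple comparison. The following Python sketch is a conceptual illustration, not TextBridge's algorithm: the function name, the 0-to-1 confidence scale, and the threshold value are all assumptions, since the manual does not state how confidence is measured.

```python
def questionable_words(words, threshold, verifier_on=True):
    """Return the words that would be shown for verification.

    words: (text, confidence) pairs; the confidence scale is hypothetical.
    threshold: words with confidence below this value are questionable.
    verifier_on: nothing is shown unless the Verifier is in use.
    """
    if not verifier_on:
        return []
    return [text for text, confidence in words if confidence < threshold]


if __name__ == "__main__":
    sample = [("TextBridge", 0.98), ("rnodem", 0.41), ("scanner", 0.87)]
    print(questionable_words(sample, threshold=0.7))
```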


By correcting errors and verifying correct words, you 
help TextBridge improve its recognition accuracy for 
the rest of the document. TextBridge uses your input 
to improve recognition decisions as the job progresses. 


The Verifier window, which is shown in Figure 3-11, 
is similar to the Preview window. In part of the 
Verifier window, TextBridge shows a zoomed image of 
the page with the questionable word highlighted for 
context. 


Above the image area, zoom tools enable you to zoom 
further in or out on the page image: 


Zoom In Zoom Out 


Next to the toolbox is the Word edit box containing 
the recognized word, which is highlighted for 
correction or verification. To the right of the edit box 
are the Accept and Undo buttons. 


To verify words during OCR, use the following
procedure.

Note: The procedure assumes, if you are scanning, that the
scanner is properly connected to your PC and is
powered on and ready. It also assumes that
TextBridge is properly installed and running.

1. If you are scanning, load the page(s) into
your scanner, then go to step 2. Otherwise,
start at step 2.

2. In the Main dialog, define the input source.

• Select either File or Scanner in the Input
From box (refer to Figures 3-2 and 3-5).

• Click the Verifier checkbox in the Main dialog.

3. Optionally, click the Preferences button to
further define the scanning and recognition
process.

Use Table 3-1 as a guide. When you are done
specifying preferences, click OK to return to the
Main dialog.

4. Click GO! in the Main dialog.


If you are scanning with a TWAIN scanner, the 
native UI for the scanner appears, and you can 
execute scanning from this interface. If you are 
scanning with an ISIS scanner, TextBridge 
automatically scans a page from the scanner. 


If you are reading the page image(s) from one or 
more TIFF files, TextBridge first displays the 
Open dialog (refer to Figure 3-6). In the Open 
dialog, specify the name(s) of the TIFF file(s), and 
their directory and drive location, then click OK. 


TextBridge begins the OCR process. When it 
encounters the first questionable word, it opens 
the Verifier window, and displays the question- 
able word highlighted in a Word edit box (refer to 
Figure 3-11). Below the edit box, the Verifier 
window displays the image of the word 
highlighted for context. 


5. Verify and/or correct the questionable word
in the Word edit box, then click the Accept
button.


☞ If you make a mistake during verification,
simply click the Undo button. The edit box 
restores the last word you edited, and you can 
then correct the mistake. 


You can control the frequency of questionable
words to verify with the Verify command,
which has five possible settings. Pull down the
Process menu, select the Verify command, and
from its walking menu, select a setting from
among Most, More, Normal, Fewer, or Fewest.
More and Most cause more words to be
displayed for verification; Fewer and Fewest
show fewer words. Normal is the default.


Some documents have words with characters 
that are not available on your keyboard 
(accents, symbols, and so on). To verify such 
characters, pull down the View menu and 
select Show Special Characters. This 
displays the special character keypad. To 
enter special characters in the Word edit box, 
select them from the keypad by clicking on 
them with the mouse (Figure 3-12). 


Figure 3-12. Special character keypad


☞ Occasionally, TextBridge mistakes noise
(marks on the page) or a horizontal line for
text. In the Verifier window, the Word edit box 
contains characters, while the image area 
shows the non-word highlighted. In such 
cases, delete all the text in the Word edit box, 
then click Accept. TextBridge ignores the 
noise, and proceeds to the next questionable 
word. 


Also, if the image of a particular word is poor 
in comparison with the rest of the document, 
you should correct the text without training 
TextBridge on the image. Simply correct the 
word in the Word edit box, then hold down the 
Control key and press the Enter key (or click 
Accept). The corrected text is output without 
TextBridge being trained on the image. 


The image area in the Verifier window is 
zoomed in to approximately the middle of the 
zoom range. If you want to get more of an idea 
of the page location of the word being verified, 
click the Zoom Out icon, then click inside the 
Verifier image area. The full page image 
appears in the image area of the Verifier 
window. Conversely, if you want to further 
magnify the display, use the Zoom In icon. 


6. Repeat step 5 until you verify enough words. 


☞ Verify at least one page of a multiple-page
document to teach the system about the entire 
document. 


7. Close the Verifier.


Pull down the Process menu and select the End 
Verification command. 


☞ The Close command in the Verifier window's
control menu is equivalent to the End 
Verification command. 


The Verifier window closes and OCR continues 
automatically. When you are finished processing 
all pages, TextBridge displays the Save As dialog 
(refer to Figure 3-4). 


8. Specify the output file information. 


• Specify the file name, destination disk and
directory.


• Also specify the output text format in the Save
File as Type pop-up menu.


• When done, click OK.


TextBridge converts the recognized text and saves 
the converted text to a file of the given name, disk 
and directory location. The Main dialog remains, 
ready for the next job. 


TIPS AND TECHNIQUES


The first three chapters of this user’s guide focus on 
information that enables you to understand, install, 
and use the basic functions of TextBridge. 


This chapter describes methods to maximize 
TextBridge OCR results. Specifically, this chapter 
covers the following topics: 


• Getting the best text recognition
• Making document processing more efficient
• Saving page images
• Running TextBridge OCR from other applications


GETTING THE BEST TEXT RECOGNITION 


TextBridge OCR software achieves a consistently high 
level of character recognition accuracy over a wide 
range of documents. However, there are some actions 
you can take to help TextBridge do the best possible 
job of character recognition for a particular document. 


This section offers some suggestions for optimizing 
text recognition. It covers these topics: 


• Use and maintain your scanner properly
• Adjust scanner brightness
• Adjust for colors
• Use the fax filter
• Use the word verifier
• Process multiple documents separately
• Use the Invert command in Preview


Use and maintain your scanner properly


How you use and maintain your scanner can make the 
difference between a successful and unsuccessful scan. 
Follow these tips: 


• Know your scanner. Read and understand all
documentation that came with your scanner.


• Maintain the scanner. Keep your scanner clean
and dust-free. Keep your scanner’s glass platen
(flatbed) free of dirt or marks that might be
captured during scanning.


• Load the scanner correctly. Make sure your
document is not scanned at an angle; a skewed
page makes character recognition more difficult.


When using the document feeder, make sure
paper guides are aligned properly for the pages
you are scanning (Figure 4-1).


If you are using the flatbed, make sure the page 
image is flush against the platen, and is straight. 
Sometimes the actual image is skewed relative to 
the paper it is printed on. Correct for this as much 
as possible before scanning. 


Adjust guides to 
fit document and 
minimize page 
skew 


Figure 4-1. Page placement in the scanner 


Adjust scanner brightness


During scanning, one of the most important settings
affecting successful character recognition is scanner
brightness. As Figure 4-2 illustrates, the original
documents you scan may vary considerably.


Darkness of text, the lightness of the background, and 
the amount of noise (dirt, smeared ink, fingerprints, 
handwriting, and other marks) on the page can all 
affect character recognition accuracy. 


Figure 4-2. Document originals and brightness


From the Scanner Settings dialog (Figure 4-3), you 
can adjust Brightness to compensate for the print 
quality and document background. You access 
Scanner Settings from the Preferences dialog. 


Try the Lighter brightness level if characters on your 
page appear too bold, are starting to fill in or are 
touching, or words are separated by very small spaces 
(as in some magazines). Recognition of documents 
with background noise, or with screened or colored 
backgrounds, can improve considerably by increasing 
the brightness setting. 


Figure 4-3. Scanner brightness settings


Try the Darker brightness level if characters on the 
page appear faint, broken, or very thin. 


If your scanner supports the Auto brightness setting, 
select this to achieve the best level of brightness for 
each page of a document. 


Another way to tell if brightness is adequate is to 
preview a page and zoom in to full resolution in the 
Preview window. This, in effect, lets you view the 
scanner output that the system “sees” (Figure 4-4).
If the previewed image does not appear to have the 
proper brightness, you can adjust the setting and 
rescan the document. 


Page is dark;
raise brightness
and rescan


Figure 4-4. Preview display
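

As a rough illustration of why the brightness setting matters, the
following Python sketch models a scanner as a simple threshold that
turns gray levels into black or white pixels. This model is an
assumption made for illustration only; it does not describe how your
scanner or TextBridge actually works, and the function name and the
sample values are hypothetical.

```python
def binarize(gray_row, threshold):
    """Turn one row of gray levels (0 = black, 255 = white) into
    binary pixels: 1 (black) if darker than the threshold, else 0."""
    return [1 if level < threshold else 0 for level in gray_row]


# Hypothetical scan line across two bold strokes with smeared ink between them.
bold_strokes = [30, 110, 125, 110, 30]
# Hypothetical scan line across a faint, thin stroke.
faint_stroke = [150, 160]

print(binarize(bold_strokes, 128))  # mid threshold: the strokes run together
print(binarize(bold_strokes, 100))  # lighter setting: the gap reappears
print(binarize(faint_stroke, 128))  # mid threshold: the faint stroke is lost
print(binarize(faint_stroke, 170))  # darker setting: the stroke is captured
```

In this simplified model, a lighter setting corresponds to a lower
threshold (fewer pixels turn black) and a darker setting to a higher
one, which matches the advice above for bold and faint characters.

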


Adjust for colors


All scanners have one or more colors that they do not 
read. These are called drop-out colors. Refer to the 
documentation that came with your scanner to 
determine the drop-out color. 


☞ If your scanner documentation does not mention
the drop-out color, examine the color of the
scanner light as it moves across the flatbed. The
color of the light determines the drop-out color. 
Many scanners have a light green scanner light, 
for example; thus the drop-out color would be light 
green. 


In addition to drop-out colors, there may be other 
colors with which your scanner has difficulty. If the 
text (or image) you are scanning is colored, or is 
printed on a colored background, you can try 
adjusting the brightness setting. 


If that does not work, try photocopying the page and 
scanning the black and white copy. 


Use the fax filter 


Fax images are
low resolution
and often are
skewed


One application of TextBridge is to recognize the text
in fax images. Fax images, typically, are low resolution
(100-by-200, 200-by-100, or 200-by-200 dots per
inch). Even so-called “fine resolution” faxes at
200-by-200 dpi are often only marginally legible (Figure 4-5).


Figure 4-5. Fax image
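

To see why low-resolution fax images are hard to read, it can help to
work out how many pixels are available for each character. The Python
sketch below is a back-of-the-envelope calculation, not part of
TextBridge. It assumes that a character is roughly as tall as its
point size (1 point = 1/72 inch), which overstates the height of most
letters; the function name is hypothetical.

```python
POINTS_PER_INCH = 72


def char_height_pixels(point_size, dots_per_inch):
    """Approximate height, in pixels, of text of the given point size
    when sampled at the given resolution along the vertical axis."""
    return point_size / POINTS_PER_INCH * dots_per_inch


# Compare 10-point text at fax resolutions and at a 300 dpi scan.
for dpi in (100, 200, 300):
    height = char_height_pixels(10, dpi)
    print(f"{dpi} dpi: about {height:.1f} pixels per character")
```

Under these assumptions, 10-point text on a 100 dpi axis has only
about a third of the pixel height it would have in a 300 dpi scan.

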


To recognize fax images, TextBridge provides a Fax
setting in the Preferences dialog. This Document
Quality filter initiates a pre-processing step that
enhances the fax image before OCR begins.


The Fax filter works on fax images stored in TIFF
files and hard copy faxes scanned at higher resolutions
(for example, 300 dpi).


Note Do not use the Fax filter on non-fax documents, either
scanned or on-line. If you do, OCR accuracy may
degrade. Also, if you notice that recognition is poor on
synthesized fax images (for example, a word processor
document “printed” to a fax modem), turn off the Fax
filter, and try verifying part of the text during OCR.


Process multiple documents separately 


TextBridge OCR software uses a variety of artificial
intelligence techniques to recognize text.


With those techniques, TextBridge actually teaches
itself about what it is recognizing.


Thus, TextBridge can improve OCR accuracy and 
speed as it scans and recognizes subsequent pages of a 
document. 


However, you can compromise this learning capability 
by processing pages of different documents to the 
same output file. 


TextBridge expects the second and successive pages of 
a document to use the same fonts it recognized on the 
first page (Figure 4-6). 


Figure 4-6. Processing multiple documents


If the second page is a totally different document, with 
different typefaces and point sizes, the knowledge that 
TextBridge gained for the first page becomes invalid. 


TextBridge must begin the learning process over 
again for the second (and successive) pages. 


If you want to scan multiple documents, and get the
best recognition results, scan each document as a
separate job.


Use the word verifier


Verify text 
for highest 
accuracy 


If you find that TextBridge is giving less than
satisfactory results on a particular document, use the word
verifier to improve recognition accuracy.


By interacting with the OCR process in the Verifier
window (Figure 4-7), you teach TextBridge about the
characters and words in the document. This can
significantly improve recognition accuracy.


Figure 4-7. Verifier window


Each word that TextBridge is unable to recognize or is 
unsure about appears in the Word edit box at the top 
of the Verifier window. The image of the word is 
highlighted below for context. 


With the Verifier, you can move through the 
recognized text and accept or correct TextBridge’s 
recognition decisions. Your input helps TextBridge 
improve recognition as the job progresses. 


Generally, in a multiple-page document, verify one or 
two pages, then end verification. TextBridge will use 
your input to make better recognition decisions for the 
rest of the document. 


On small (one- or two-page) documents, to attain the 
highest recognition accuracy, you can verify the entire 
document. 


For more information about using the word verifier 
and all its options, refer to Chapter 3. 


Use the Invert command in Preview


Invert reversed 
images before 
OCR 


TextBridge is capable of recognizing on-line TIFF files 
that originate from fax modems or other sources. 


Occasionally, image data is saved so that the picture 
elements (pixels) in the resulting file are reversed: 
the white page background is black and the print on 
the page is white. This is often true with Intel 
FAXability files, for example. 


TextBridge cannot recognize such files. For recog- 
nition of an on-line TIFF file, it is critical that the 
image contain black type on a white background. 


To enable recognition of files with reverse images,
TextBridge provides the Invert command in the
Preview window’s View menu (Figure 4-8).


Figure 4-8. Inverting a document


If you are unsure whether an on-line file is reversed, 
display it in the Preview window before starting OCR. 
If it shows up with white type on a black background, 
pull down the View menu and click the Invert 
command. TextBridge reverses the image. 


Then you can begin the OCR process. Note that 
inversion must be manually corrected for each TIFF 
file that is stored this way. 
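

If you have many TIFF files and want a quick way to spot the reversed
ones before previewing them, a simple pixel count is one possible
check. The Python sketch below is only an illustration under stated
assumptions: it treats an image as rows of 0 (white) and 1 (black)
pixels, and it guesses that a page is reversed when more than half of
its pixels are black. Neither the heuristic nor the function names
come from TextBridge.

```python
def looks_reversed(pixels):
    """Guess whether a binary page image is reversed (white type on a
    black background). Assumes 1 = black and 0 = white."""
    total = sum(len(row) for row in pixels)
    black = sum(sum(row) for row in pixels)
    return total > 0 and black * 2 > total


def invert(pixels):
    """Return a copy of the image with black and white exchanged."""
    return [[1 - pixel for pixel in row] for row in pixels]


# A tiny hypothetical "page": mostly black with a few white pixels of type.
page = [
    [1, 1, 1, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
]
if looks_reversed(page):
    page = invert(page)
print(page)
```

A page with large black photographs could fool this count, so the
Preview window remains the reliable way to check a file.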


For a procedure to use Preview, refer to Chapter 3. 


TIPS FOR EFFICIENT PROCESSING


When you first use TextBridge, you may find it easiest 
to scan a document without adjusting the default 
preferences. In many cases, using default preferences 
provides good results. 


However, if you want to get the best performance from 
TextBridge, there are a few measures you can take 
before starting OCR: 


• use the zone tool in Preview
• use the Ignore Photos/Halftones setting
• use auto-orientation when appropriate
• use auto-segmentation for multi-column
documents


These features help to assure that the system 
processes only the parts of a page that are essential, 
and processes them correctly. 


Over the course of an entire document, or many 
documents, using these features can translate to 
valuable time savings. 


Zone to capture only the data you want 


Some documents may display logos, graphics, running
headers and footers, and other matter that you do not
need to capture, and that could otherwise slow down the
recognition process.


With the zoning tool in the Preview window, you can
identify just that portion of the page(s) that you want
to capture (Figure 4-9).


Refer to Chapter 3 for information about using 
preview tools. 


Figure 4-9. Zone in Preview


Use the Ignore Photos/Halftones option 


On a printed document, a halftone photograph is 
made up of different-sized black dots. Ordinarily, 
TextBridge would spend some time trying to recognize 
the halftone dots as text. 


Eventually, TextBridge would conclude that it was 
trying to recognize a halftone and would then ignore 
it. However, to speed up text recognition on docu- 
ments that also contain halftones, you can turn on the 
Ignore Photos/Halftones setting before OCR. 


With the Ignore Photos/Halftones option on, 
TextBridge quickly scans the page image and masks 
out halftones before beginning character recognition 
(Figure 4-10). Thus, actual character recognition is 
faster and more efficient. 


Figure 4-10. Ignore Photos/Halftones filter


To use the Ignore Photos/Halftones option, select the 
Preferences button from the Main dialog. In the 
Preferences dialog, click the Ignore Photos/Halftones 
checkbox on. 


☞ Although the Ignore Photos/Halftones filtering step
is relatively quick, do not specify it if your 
document does not contain halftones. 


Use auto page orientation 


TextBridge provides a tool that automatically
determines the orientation of a page, rotates it in
memory if necessary, then begins OCR (Figure 4-11).


Specify Auto Page Orientation in the Preferences
dialog, which you access from the Main dialog. This
feature is useful in certain circumstances, for
example:


• when you are processing documents with pages of
mixed orientation


• when you are processing a TIFF file, and you do
not know the orientation of the image it contains


Figure 4-11. How auto page orientation works

In the first instance, you could be processing a 
document that has mostly portrait pages mixed with 
several landscape pages. 


TextBridge scans each page, determines whether it is 
portrait, landscape (90-degrees or 270-degrees), or 
upside-down, and rotates it to portrait (0-degrees) 
before beginning OCR. 


In the second instance, if the TIFF image you are 
about to recognize is sideways or upside-down, Text- 
Bridge will rotate it appropriately, then recognize it. 


☞ Auto-orientation is a processing stage that
happens before recognition. Therefore, to achieve 
the fastest OCR, use auto-orientation only when 
the circumstances require it. 
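

The rotation step described above can be pictured with a short Python
sketch. It is a simplified illustration, not TextBridge's
implementation: the image is a small grid of values, the detected
orientation is assumed to be supplied by some other step as the number
of degrees the page has been turned clockwise, and both function names
are hypothetical.

```python
def rotate_clockwise(pixels):
    """Rotate a grid of pixels by 90 degrees clockwise."""
    return [list(row) for row in zip(*pixels[::-1])]


def rotate_to_portrait(pixels, detected_degrees):
    """Undo a detected clockwise rotation of 0, 90, 180, or 270 degrees
    so that the page ends up upright (0 degrees)."""
    if detected_degrees not in (0, 90, 180, 270):
        raise ValueError("orientation must be 0, 90, 180, or 270 degrees")
    quarter_turns = ((360 - detected_degrees) % 360) // 90
    for _ in range(quarter_turns):
        pixels = rotate_clockwise(pixels)
    return pixels


upright = [[1, 2],
           [3, 4]]
sideways = rotate_clockwise(upright)       # page turned 90 degrees clockwise
print(rotate_to_portrait(sideways, 90))    # back to the upright grid
```

The sketch also shows why the step costs time: every page must be
examined, and a rotated page must be rewritten in memory before
recognition can start.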


Use auto page segmentation


TextBridge provides a tool that automatically locates
regions of text on the page, defines their order, then
begins OCR. This auto page segmentation feature
is critical for recognition of pages that have more than
one column and/or unusual layouts (Figure 4-12).


Figure 4-12. Page segmentation and region ordering


Important You must turn auto-segmentation on if you are 
processing pages that have more than one column of 
text. Otherwise, TextBridge can output regions of 
recognized text in the wrong order. Do not use auto- 
segmentation on single-column documents. 


Auto-segmentation is a pre-processing stage that 
occurs before OCR begins. 


Note that you can create a zone in Preview and still 
use auto-segmentation. Auto-segmentation will work 
inside the zone only. So, for example, if you draw a 
zone around two columns of a three-column layout, 
auto-segmentation will detect and order the two 
columns in preparation for the OCR process. 
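

To make the idea of region ordering concrete, the Python sketch below
sorts text regions into column-by-column reading order. It is a
deliberately simple stand-in for illustration, not the method
TextBridge uses: regions are assumed to be plain rectangles, and two
regions are assumed to belong to the same column when they overlap
horizontally. The Region type and the function name are hypothetical.

```python
from typing import NamedTuple


class Region(NamedTuple):
    left: int
    top: int
    right: int
    bottom: int


def reading_order(regions):
    """Group regions into columns from left to right, then read each
    column from top to bottom."""
    columns = []  # each entry: {"right": rightmost edge, "regions": [...]}
    for region in sorted(regions, key=lambda r: r.left):
        if columns and region.left < columns[-1]["right"]:
            # Overlaps the current column horizontally: same column.
            columns[-1]["regions"].append(region)
            columns[-1]["right"] = max(columns[-1]["right"], region.right)
        else:
            columns.append({"right": region.right, "regions": [region]})
    ordered = []
    for column in columns:
        ordered.extend(sorted(column["regions"], key=lambda r: r.top))
    return ordered


# Two columns, each with two paragraphs, listed in arbitrary order.
a = Region(0, 0, 100, 50)      # left column, top
b = Region(0, 60, 100, 120)    # left column, bottom
c = Region(120, 0, 220, 50)    # right column, top
d = Region(120, 60, 220, 120)  # right column, bottom
print(reading_order([c, b, d, a]) == [a, b, c, d])
```

Reading the same regions strictly from top to bottom would interleave
the two columns, which is the wrong-order output that the Important
note above warns about.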




SAVING PAGE IMAGES 


One of the checkbox options on the main dialog is 
Save Page Images. This option, available when you 
set Input From Scanner, enables you to save a binary 
(black and white) image of each page scanned during 
a TextBridge OCR session. 


Note TextBridge saves page images as TIFF files with
CCITT Group 3 compression. Group 3 is a compression
standard specified by the CCITT (International
Telegraph and Telephone Consultative Committee),
an international standards organization.


After you click GO!, and the first page is scanned,
TextBridge displays the Save Page Image As dialog
(Figure 4-13).


Type base name
for page images


Figure 4-13. Save Page Image As dialog


The Save Page Image As dialog is very similar to a 
Windows standard Save As dialog, in that it allows 
you to specify a file name, file type, output directory 
and disk drive. 


The file name is the base name on which the page 
image file names are built. It is also the document 
name that appears by default in the TextBridge Save 
As dialog when OCR is completed. The default file 
name is untitled with the .tif extension. 


For example, suppose you specified the name “guide”
in the Save Page Image As dialog. Page image files
would be stored with a name in the format:


guidnnnn.TIF


where nnnn is the document page number with 
leading zeroes (for example, 0001, 0002, and so on). 


Page image files are named sequentially within the 
directory. If a file of the same name (for example, 
“guid0001.tif”) already exists, TextBridge will 
start with the next number in sequence. 


Also, at the end of the job, the document name (for 
example, “guide”) automatically appears in the File 
Name box in the normal Save As dialog. 
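

The naming scheme can be summarized in a few lines of Python. This is
a sketch of the behavior as described in this section, with two
assumptions that the manual does not state outright: that the base
name is cut to four characters (inferred from the “guide” example, so
that the name plus four digits fits an eight-character DOS file name),
and that “the next number in sequence” means one more than the highest
number already used. The function name is hypothetical.

```python
import re


def page_image_names(base_name, page_count, existing_files=()):
    """Return the page image file names for a job of page_count pages."""
    # Assumption: keep only the first four characters of the base name.
    stem = base_name[:4].lower()
    pattern = re.compile(re.escape(stem) + r"(\d{4})\.tif")
    # Collect page numbers already used by files with the same stem.
    used = [
        int(match.group(1))
        for name in existing_files
        if (match := pattern.fullmatch(name.lower()))
    ]
    # Assumption: continue after the highest existing number.
    first = max(used, default=0) + 1
    return [f"{stem}{number:04d}.tif" for number in range(first, first + page_count)]


print(page_image_names("guide", 3))
print(page_image_names("guide", 2, ["cover.tif", "guid0001.tif", "guid0002.tif"]))
```

The first call names the pages of a new job; the second shows a job
that continues numbering in a directory that already holds page
images from an earlier job with the same base name.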


In the Save Page Image As dialog, the initial working 
directory is the directory from which you launch 
TextBridge: 


C:\TXBRIDGE\BIN 


However, you can specify any other disk drive and 
directory in which to store the page images. This 
becomes the new working directory for the job, and, 
like the document name, will also be in place in the 
Save As dialog when OCR is completed. 


For page image file format, the Save File As Type 
menu provides only one selection, TIFF CCITT-3 
Intel. Page images are saved exactly as scanned in 
binary (black and white) format. 


Note that if you click Cancel in the Save Page Image 
As dialog, the dialog closes and TextBridge terminates 
the job. The Main dialog remains ready for you to 
start again. (For example, if you do not want to save
page images, you can click the Save Page Images
checkbox off and re-start the job.)


RUNNING TEXTBRIDGE FROM OTHER APPLICATIONS


TextBridge OCR for Windows is actually a suite of 
applications that enables you to run OCR from within 
virtually any other Windows application. 


In addition to the main utility, which runs as a 
standalone program, and has the widest feature set, 
TextBridge OCR is provided in two other forms: 


e TextBridge Application Server, a program 
that acts as a menu item from inside virtually any 
registered Windows text application (word 
processor, desktop publishing program, 
spreadsheet, database application, and so on). 


e TextBridge OCR Printer, a capability that 
enables you to send an image in any format to a 
version of TextBridge OCR that works like a 
conventional print driver. 


This section provides information about using 
TextBridge OCR in these forms. 


Note TextBridge also supports a DDE interface. Interested 
developers and system integrators should call Xerox 
Imaging Systems Customer Support for details. 


Use the TextBridge Application Server 


The TextBridge Application Server (TAS) is a 
Windows program that can be “attached” to, and thus 
run from within, other Windows text applications. 


Once attached, TAS appears in the host application’s 
File menu as the TextBridge OCR command. When 
you select TextBridge OCR, the TextBridge main 
dialog appears as if it were a dialog of the host 
application. From here, you can set up and initiate 
OCR exactly as you would with the standard 
TextBridge program. 


Starting TAS and registering applications


TAS is installed during the TextBridge installation 
process described in Chapter 2 of this manual. 


At the end of the installation process, the TextBridge 
setup program creates a TextBridge OCR program 
group that includes the TextBridge Application Server 
program item (Figure 4-14). 


Figure 4-14. TAS program item


Before you can run TAS from within an application, 
you must start the program, and you may have to 
register the application, as well: 


1. Double-click the TAS program item in the 
TextBridge OCR program group. 


The program starts and appears as a minimized 
icon on your Windows desktop. 


2. Double-click the icon on the Windows 
desktop. 


The TAS registration dialog appears (Figure 
4-15). 


Select the
unregistered
application


Then click Add


Figure 4-15. TAS registration dialog


3. Register the application, if necessary. 


• At the top of the registration dialog, highlight
the application that you want to register.


• Click the Add button to add the application to
the Registered Applications list at the bottom
of the dialog.


• When you are done registering your
application(s), click OK.


You can now go on to use TextBridge OCR from 
within your registered application. 


Running TAS from within your application 


In the File menu of any active registered application, 
the TextBridge OCR command appears as the last 
command directly above the Exit command. 


Note For TAS to work, the host application must have a
File menu, and in the File menu, an Exit command. A
majority of Windows applications use this standard.


As an example, Figure 4-16 shows the TextBridge 
OCR command in the WordPerfect® File menu. 


Figure 4-16. TextBridge OCR command


To run TAS, and import recognized text directly into 
the host application’s open document, use the 
following procedure. 


1. Start the TextBridge Application Server.


Double-click the program item in the TextBridge
OCR program group (refer to Figure 4-14).


☞ To have TAS start automatically whenever
you start Windows, place it in the StartUp
program group.


2. Make sure the host application is registered.


Refer to the procedure in the previous section,
“Starting TAS and registering applications.”


3. Start the host application.


With the host application, open a new or existing
document into which you want to import
recognized text.


4. Pull down the host application’s File menu
and select the TextBridge OCR command.


Status messages appear: 
Connecting to TextBridge Services... 


Connection with TextBridge Services 
established. 


In a few moments, the TextBridge main dialog 
appears. 


5. Set up and initiate OCR from the main 
dialog. 


With the main dialog displayed, you can choose 
either File or Scanner as the input source, specify 
Preferences, and proceed exactly as if you were 
using TextBridge as a standalone application. 


☞ The Preview, Verify, and Save Page Images
capabilities are not available in the TAS 
version of TextBridge. If you require these 
capabilities, run TextBridge as a standalone 
application, and save recognized text in your 
word processing or other text format. 


Refer to Chapter 3 for step-by-step procedures for 
using TextBridge; refer to earlier sections of this 
chapter for usage tips and techniques. 


When OCR is complete, TAS closes, and recog- 
nized text appears at the cursor position in your 
application’s open document ready for editing. 


☞ TAS uses the Windows clipboard to cut and
paste recognized text to your application
either as formatted RTF (Rich Text Format)
or as plain ASCII text. If your application
supports RTF pasted from the clipboard, then
RTF is used. If not, recognized text is pasted
as plain text, and the formatting (bold, italic,
and so on) is lost.
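

The fallback rule in the tip above amounts to a single decision, shown
here as a Python sketch. The sketch only restates the rule; it does
not use the Windows clipboard, and the format labels, function names,
and sample text are hypothetical.

```python
def choose_paste_format(accepted_formats):
    """Pick RTF when the host application accepts it, else plain text."""
    return "RTF" if "RTF" in accepted_formats else "TEXT"


def text_to_paste(rtf_text, plain_text, accepted_formats):
    """Return the version of the recognized text that would be pasted."""
    if choose_paste_format(accepted_formats) == "RTF":
        return rtf_text
    return plain_text  # formatting such as bold and italic is lost


rtf_version = r"{\rtf1 This is {\b bold} text.}"
plain_version = "This is bold text."
print(text_to_paste(rtf_version, plain_version, {"RTF", "TEXT"}))
print(text_to_paste(rtf_version, plain_version, {"TEXT"}))
```

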


In its other forms, TextBridge OCR for Windows can
recognize image files only if they are stored in TIFF 
format. Certain applications, particularly some 
facsimile (fax) programs, store fax page images only in 
PCX, DCX, or some other proprietary format. 


To run OCR on non-TIFF images, you can use the 
TextBridge OCR printer. The OCR Printer is 
designed to appear in Windows applications as just 
another target printer. It enables you to “print” an 
image from a Windows application and produce a 
recognized and formatted text file as the result. 


A typical use of the OCR Printer is to OCR a page
image directly from a fax or imaging application. The
OCR Printer follows the same model that many fax
programs use to send faxes, in which the fax image is
“printed” to the fax modem and sent to another fax
modem or fax machine.


The OCR printer offers the added advantage of being 
able to recognize virtually any image format (DCX, 
PCX, Corel, TIFF, and so on). Virtually any Windows 
program designed to handle images can make use of 
the OCR printer. 


To prepare the OCR printer for use, refer to the 
following subsection, “Adding the OCR printer.” Then 
refer to “Using the OCR printer in your imaging 
application” for instructions to download an image 
and produce a text file. 


Adding the OCR printer 


The OCR Printer program files are installed along 
with the main TextBridge application. 


However, as with actual printer drivers, you must add 
the OCR Printer to the list of printers available to 
Windows applications on your PC. 


Click to show
list of printers


This procedure assumes that you have already 
installed TextBridge as described in Chapter 2. 


1. From the Windows Program Manager, open
the Main program group and double-click
the Control Panel icon.


This opens the Control Panel window with icons 
for various parts of the system. 


2. Double-click the Printers icon in the Control 
Panel window. 


This opens the Windows Printers dialog box 
(Figure 4-17). 


Figure 4-17. Printers dialog


3. Click the Add button in the Printers dialog. 


The dialog box expands to show the List of 
Printers that can be added. 


4. Highlight the following item in the list, then
click the Install button. 


Install Unlisted or Updated Printer 


This action displays an Install Drivers dialog box 
in which you are instructed to specify the drive 
and directory location of the printer driver. 


5. In the Install Drivers dialog box, enter the 
TextBridge BIN directory pathname: 


c:\txbridge\bin 


6. Click OK (or press Enter). 


This displays the Add Unlisted or Updated Printer
dialog (Figure 4-18).


Click to add the
OCR printer


Figure 4-18. Add Unlisted or Updated Printer dialog


7. Select the TextBridge OCR Printer entry and click OK.


The Add Unlisted or Updated Printer dialog
closes, leaving open the Printers dialog.


8. Click Close in the Printers dialog to end the 
Add process. 


You can now go on to use the TextBridge OCR 
Printer as described in the next subsection. 


Using the OCR printer in your imaging application


You can use the OCR Printer with any Windows 
application that can open and view an image file. 


For example, WinFax Pro (from Delrina Technology 
Inc.) provides an Image Viewer to view, manipulate, 
and print fax images. Using the Print command in the 
File menu of WinFax Image Viewer, you could specify 
and use the OCR Printer to perform character 
recognition on the fax image. 


The OCR Printer enables you to access TextBridge 
Preferences before beginning recognition. After 
recognition is complete, you can specify the output 
text file name, location, and format in the standard 
TextBridge Save As dialog. 


To use the OCR Printer in your application, follow 
these steps: 


1. Open the imaging application, and display 
the image to be recognized. 


Note: The image must be binary (black and white)
and within the accepted range of resolutions
supported by TextBridge. TextBridge can
recognize images of 100-by-200, 200-by-100,
200-by-200, 300-by-300, and 400-by-400 dots
per inch.


2. In your application's Print Setup dialog, 
specify the TextBridge OCR Printer as the 
destination printer. 


You should already have added the OCR Printer
in the Windows Control Panel Printers program,
as described in the previous subsection,
“Adding the OCR print driver.”


3. Optionally, define TextBridge preferences.


• Click the Options or Setup button in your
imaging application’s Print Setup dialog. This
displays a secondary Setup dialog.


• Click the Preferences button in the Setup
dialog to display the TextBridge Preferences
dialog.


• Specify standard or fax document quality,
page orientation, auto page segmentation, and
so on.


• When you are done, click OK in the
Preferences dialog, then close the other
dialogs as well.


Note: In the Print Setup dialog of your application,
if there is an option to use the actual
printer resolution, turn it on.


4. Start the OCR process on the displayed
image. 


• From your application, pull down the File
menu and select Print.


• Click OK in the Print dialog. Processing
messages now appear as OCR proceeds:


Processing... 
Acquiring Image... 


Recognizing text... 


When recognition is complete, the TextBridge
Save As dialog appears (Figure 4-19).


[Figure: Save As dialog. Callouts: specify the output file name and
directory (File Name, Directories, and Drives boxes); specify the
output format (List Files of Type box, here Ami Pro 3.0 (SAM)).]


Figure 4-19. Save As dialog


5. Specify the output file name, format, drive 
and directory destination, then click OK. 


The recognized text file is converted to the
specified format and written to your hard disk.


TROUBLESHOOTING AND
ERROR CORRECTION


TextBridge is designed to be easy to install and use, 
and, under typical circumstances, you should rarely 
experience problems. 


However, should you encounter a problem during 
installation and/or use of TextBridge, first consult this 
appendix to try to resolve the problem yourself. 


TextBridge error messages appear in a standard 
Windows error dialog box, as shown in Figure A-1. 


[Figure: TextBridge error dialog displaying the message “Invalid
input format.” and “Error Code: 673”.]


Figure A-1. Error message example


For information to resolve an error condition, refer to 
the appropriate section in this appendix. This 
appendix is organized in three sections: 


• What to do if you encounter a problem
• Troubleshooting common problems
• Correcting error conditions


WHAT TO DO IF YOU ENCOUNTER A PROBLEM


If you are a new or inexperienced user, and you 
encounter a problem, first refer to “Troubleshooting 
Common Problems,” the next section in this appendix. 
That section suggests solutions to common problems 
found by TextBridge users. 


If you are a more experienced TextBridge user and 
you encounter an error, refer to the “Correcting Error 
Conditions” section to locate the error, and follow the 
recommended solution. 


When you get an error message, write down the text 
of the message, along with the error code number. 


Also, note the sequence of steps you took to generate 
the message. This information can be useful later if 
you cannot solve the problem and must call your 
scanner manufacturer for support. 


If you get an error message that you cannot locate in
this appendix, and/or you cannot resolve a problem on
your own, contact your scanner manufacturer.


If you should need to call, be ready to provide:


• your software registration number (the serial
number on Disk 1 of the original TextBridge
installation diskettes)


• a list of the steps that led up to the problem


• a verbatim description of the error message
(and/or number)


TROUBLESHOOTING COMMON PROBLEMS


This section describes typical problems with
TextBridge and suggests ways to resolve them.
It is organized into four topics:


• ISIS scanner problems
• TWAIN scanner problems
• Virtual memory problems
• Other problems


ISIS scanner problems 


TextBridge provides a number of ISIS scanner drivers 
developed by Pixel Translations and other sources. 


Following are some common error messages relating 
to ISIS scanner setup and use, and suggestions to 
correct the error conditions: 


Can’t open system-level scanner driver; check 
installation 


In Scanner Setup, after using the Select Source 
command, you run the Acquire command to test 
the scanner, and this message appears. 


Assuming you have correctly installed the scanner
interface card and connected and powered on the
scanner, do the following: Load the scanner
system-level driver (.sys file) onto your PC,
reference the complete file pathname in a device
statement in your config.sys file, then restart
your PC.


The system-level driver, and instructions to install 
it, should be provided by the interface card or 
scanner manufacturer. 
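

To illustrate the device statement described above, the following
Python sketch checks whether the text of a config.sys file references
a given driver. It is only a rough aid, not part of TextBridge. The
driver path C:\SCANNER\SCANDRV.SYS and the sample file contents are
hypothetical placeholders; use the path given in your scanner or
interface card documentation.

```python
def device_lines(config_text):
    """Return driver paths named in DEVICE= or DEVICEHIGH= statements."""
    paths = []
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.upper().startswith("REM"):
            continue  # skip blank lines and REM comments
        key, sep, value = line.partition("=")
        tokens = value.split()
        if sep and tokens and key.strip().upper() in ("DEVICE", "DEVICEHIGH"):
            # The first token is the driver path; any others are switches.
            paths.append(tokens[0].upper())
    return paths


def references_driver(config_text, driver_path):
    """True if a device statement names driver_path (case-insensitive)."""
    return driver_path.upper() in device_lines(config_text)


# Hypothetical file contents, for illustration only.
SAMPLE_CONFIG = "\n".join([
    r"DEVICE=C:\DOS\HIMEM.SYS",
    r"REM scanner driver supplied by the card manufacturer",
    r"DEVICE=C:\SCANNER\SCANDRV.SYS",
])

print(references_driver(SAMPLE_CONFIG, r"C:\SCANNER\SCANDRV.SYS"))
```

If the check fails for your own file, add the device statement with
the complete pathname, then restart your PC as described above.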


ISIS error or Cannot write to device USCAN.XXX


In Scanner Setup, when you test the scanner with 
Acquire, one of these messages can appear. 


If you have an Envision scanner, install the ISIS 
driver provided by Envision Systems, Inc. 


Choose Select Source, insert the Envision diskette 
into the disk drive, and type the directory path: 


drive:\txbridge\6100


In the Add dialog, select the ISIS driver and 
proceed accordingly. 


If you encounter one of these errors with another
scanner, the TextBridge ISIS driver could be
outdated.


Call the scanner manufacturer to see if an 
updated ISIS driver is available. 


If not, call Customer Support. 


Also, these errors can be generated by an address 
conflict with another device. 


Try changing the memory address of your scanner 
card according to manufacturer instructions. 


Finally, these error messages can be generated by
a memory manager, such as EMM386, that allocates
your scanner card memory address to another device.


In that case, you need to exclude your scanner 
card’s memory address in the EMM386 statement 
in your config.sys file. 
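

As a sketch of what such an exclusion looks like, the Python example
below scans config.sys text for EMM386 lines and reports the address
ranges excluded with the X= switch. The sample line and the range
D000-D7FF are hypothetical; the correct range for your scanner card
must come from the card manufacturer's documentation.

```python
import re


def emm386_exclusions(config_text):
    """Return (start, end) segment ranges excluded via X= on EMM386 lines."""
    ranges = []
    for line in config_text.splitlines():
        upper = line.upper()
        if "EMM386" not in upper or upper.lstrip().startswith("REM"):
            continue  # only active EMM386 statements are of interest
        for start, end in re.findall(r"\bX=([0-9A-F]{4})-([0-9A-F]{4})", upper):
            ranges.append((int(start, 16), int(end, 16)))
    return ranges


def is_excluded(config_text, segment):
    """True if the segment address lies inside an excluded range."""
    return any(lo <= segment <= hi for lo, hi in emm386_exclusions(config_text))


# Hypothetical EMM386 statement, for illustration only.
SAMPLE = r"DEVICE=C:\DOS\EMM386.EXE NOEMS X=D000-D7FF"

print(is_excluded(SAMPLE, 0xD400))
```

Back up config.sys before editing it, and restart your PC after
changing the EMM386 statement so that the exclusion takes effect.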


TWAIN scanner problems


TWAIN is an emerging industry standard for the 
development of scanner and other image-capture 
device drivers. 


TextBridge supports fully TWAIN-compliant
hand-held scanners and other devices.


For your TWAIN scanner to work with TextBridge, it 
must have the following software: 


TWAIN source manager (TWAIN.DLL)—This
software manages the communication between
your scanner’s TWAIN source driver and
TextBridge. It is provided by your scanner
manufacturer and must be loaded into the
Windows directory, typically C:\WINDOWS.


TWAIN source driver—This is the actual
scanner driver. It is provided by your scanner
manufacturer and typically is loaded in
C:\WINDOWS\TWAIN, or a subdirectory of this
directory path.


This section describes some of the problems that you
can encounter with a TWAIN scanner while using it
with TextBridge, specifically:


• Problems with Scanner Setup
• Problems with buffered memory mode
• TWAIN source driver errors


Problems with Scanner Setup 


The Scanner Setup program links your scanner’s 
TWAIN source driver with TextBridge. 


If your scanner’s TWAIN software is not properly 
installed, you can encounter one of the following 
problems: 


Under the Type menu, TWAIN is dimmed 


TextBridge cannot find the TWAIN source 
manager (TWAIN.DLL) in the Windows directory, 
or there is no TWAIN subdirectory in the Windows 
directory. 


Check to see that a file named TWAIN.DLL resides
in the C:\WINDOWS directory. Check also to see if
there is a C:\WINDOWS\TWAIN subdirectory.


If either of these conditions is untrue, repeat all 
TWAIN installation steps described in your 
scanner documentation. Verify the existence of the 
TWAIN source manager and the TWAIN source 
subdirectory, as above. Restart your PC. Then 
try running Scanner Setup again. 


If the TWAIN type is still dimmed in the Scanner 
Setup Type menu, call Customer Support. 


No TWAIN sources installed 


Under the Type menu, you have selected TWAIN.
When you run the Select Source command under
the File menu, this message is displayed. This
means that TextBridge cannot locate any TWAIN
drivers in the C:\WINDOWS\TWAIN subdirectory.


Repeat all TWAIN scanner installation steps as 
instructed by your scanner documentation. 
Restart your PC. Then try running Scanner 
Setup again. 


If you continue to get this message, call Customer
Support.


No TWAIN source selected


When you run the Acquire command under the 
File menu, this error message is displayed. 


Run Select Source first, select a TWAIN source 
driver, then run the Acquire command again. 


Problems with buffered memory mode 


To support TWAIN on 4Mb systems, TextBridge 
requests the TWAIN source driver to use buffered 
memory mode. 


If the TWAIN source driver correctly supports 
buffered memory mode, it uses no more than 64 
kilobytes (Kb) of memory at a time, passing the 
scanned image to TextBridge in segments. 


TextBridge then copies these segments into the 
memory it has set aside to store the page image it is 
about to recognize. 


Ideally, buffered memory mode reduces the total 
amount of memory the TWAIN source driver and 
TextBridge use to manage the scanned image. 
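

The following Python sketch works through the arithmetic behind
buffered memory mode. It assumes a letter-size page (8.5 by 11 inches)
scanned at 300-by-300 dpi, stored uncompressed at one bit per pixel
with each row padded to a whole byte. These are illustrative
assumptions; the manual does not state how a given source driver lays
out its image data.

```python
import math

SEGMENT_BYTES = 64 * 1024  # buffered-mode limit of 64 kilobytes at a time


def binary_page_bytes(width_in, height_in, x_dpi, y_dpi):
    """Bytes needed for an uncompressed 1-bit-per-pixel page image."""
    width_px = round(width_in * x_dpi)
    height_px = round(height_in * y_dpi)
    bytes_per_row = math.ceil(width_px / 8)  # 8 pixels per byte, rows padded
    return bytes_per_row * height_px


page = binary_page_bytes(8.5, 11, 300, 300)
segments = math.ceil(page / SEGMENT_BYTES)

print(page)      # 1052700 bytes, roughly 1Mb
print(segments)  # 17 segments of at most 64Kb each
```

Under these assumptions a full page occupies about 1Mb, which the
source driver hands over in 17 pieces rather than holding it all in
memory at once.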


Some TWAIN source drivers do not properly support 
buffered memory mode, and have problems delivering 
a clean image to TextBridge. 


In such cases, the image tends to be severely slanted
or otherwise garbled, and TextBridge cannot perform
accurate OCR on it.


If you encounter this problem, you can direct 
TextBridge to request native memory mode. 


In native memory mode, the TWAIN source driver
allocates enough memory to store the entire page
image before it passes it to TextBridge.


To specify native memory mode:


1. Edit the TextBridge initialization file:


c:\windows\txbridge.ini


2. At the bottom of the file, insert the following
line exactly as shown.


memory=native


3. Save the changes to the file.


4. Quit TextBridge and start it again, then try
scanning with your TWAIN scanner.


Note: Native memory mode requires 1Mb of memory. If your
PC is a 4Mb machine, you can encounter memory
problems running in native memory mode. In this
case, you should upgrade your PC to 8Mb of RAM.
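

The edit in steps 1 through 3 can also be sketched as a small Python
function that adds the memory=native line to the bottom of the file's
text only if it is not already there. The sample file contents below
(section and key names) are hypothetical; only the memory=native line
comes from this manual.

```python
def ensure_native_memory(ini_text):
    """Return ini_text with 'memory=native' as its last line, added once."""
    lines = ini_text.splitlines()
    if any(line.strip().lower() == "memory=native" for line in lines):
        return ini_text  # already present; leave the text untouched
    lines.append("memory=native")
    return "\n".join(lines) + "\n"


# Hypothetical txbridge.ini contents, for illustration only.
SAMPLE_INI = "[Settings]\nlanguage=English\n"

updated = ensure_native_memory(SAMPLE_INI)
print(updated)
```

Running the function a second time on its own output changes nothing,
so the line is never duplicated. Keep a backup copy of txbridge.ini
before changing it.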


TWAIN source driver errors 


The TWAIN standard is in its infancy. Developers of 
TWAIN source drivers fine-tune them so that the 
scanners work with a particular application. 
Typically, the scanner, source driver, and application 
are sold as a bundle, and they all work fine together. 


However, if you get an error from the TWAIN source
driver while using your device with TextBridge, it
could be that the source driver is not fully
TWAIN-compliant.


Contact the manufacturer to see if an updated
TWAIN source driver is available for your device. If
not, call Customer Support.


Virtual memory problems


Note 


Some problems in using TextBridge are related to not 
allocating enough virtual memory in Windows. 


Because TextBridge must process large image files 
during OCR, the program requires a minimum of 
eight megabytes (8Mb), and preferably 12 to 16Mb, of 
permanent virtual memory, especially on PCs that 
only have 4Mb of RAM (random access memory). 


Permanent virtual memory is a contiguous block of 
swap space on your hard drive. It cannot be located on 
a compressed drive. 


On systems with 8Mb of RAM, it is possible, although
not recommended, to run TextBridge using
temporary (non-contiguous) virtual memory.


Following are a few examples of problems that can be 
related to virtual memory: 


• error message “Invalid index for language”


Note: This error can also have other
causes; see the section, “Other problems.”


• error message “General protection fault”
• TextBridge hangs while acquiring the image
• the scanner stops during a scan


If you have installed TextBridge and experience
problems while using it, check and, if necessary,
change your virtual memory setting in Windows. Use
the following procedure:


1. From the Program Manager, open the Main
program group, and select Control Panel. 


2. Double-click the 386 Enhanced icon. 


3. Click the Virtual Memory button to display 
the current swap file size and type 
(permanent or temporary). 


If virtual memory is less than 8Mb (preferably 
16Mb) in size, or is temporary, you may need to 
change it, as described in the next step. 


4. Change virtual memory.


• Click the Change button to display Maximum
Size, Recommended Size (not in DOS 6.0), and
New Size values.


• If Type is not Permanent, specify Permanent.
This is required on 4Mb systems, recommended
on 8Mb systems.


• Using the now-displayed Recommended Size
value (not on DOS 6.0), enter this value as the
New Size, or simply accept the New Size
amount.


Note: If there is not enough contiguous space to
create a large enough permanent swap file
(8-16Mb), you will need to defragment
your disk. Use a utility such as Norton
Utilities’ SpeedDisk to perform this
operation.


With the appropriate virtual memory set up on your 
Windows-based PC, you should be able to use 
TextBridge successfully. 
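

The rules in this procedure can be summarized in a short Python
sketch. It simply restates the guidance above (8Mb minimum, 12 to 16Mb
preferred, Permanent type required on 4Mb systems and recommended on
8Mb systems); the function name and message wording are illustrative
and are not part of TextBridge or Windows.

```python
def swap_file_advice(ram_mb, swap_mb, permanent):
    """Return a list of suggested changes; an empty list means none needed."""
    advice = []
    if swap_mb < 8:
        advice.append("Increase the swap file to at least 8Mb (12 to 16Mb preferred).")
    elif swap_mb < 12:
        advice.append("Swap file meets the 8Mb minimum; 12 to 16Mb is preferred.")
    if not permanent:
        if ram_mb <= 4:
            advice.append("Set the swap file type to Permanent (required on 4Mb systems).")
        else:
            advice.append("Consider a Permanent swap file (recommended on 8Mb systems).")
    return advice


for message in swap_file_advice(ram_mb=4, swap_mb=4, permanent=False):
    print(message)
```

Read the current values from the Virtual Memory dialog in step 3, then
compare them against the advice before making changes in step 4.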


If you still encounter any of the problems listed above,
or problems that you cannot otherwise resolve, contact
TextBridge Customer Support.




Other problems during TextBridge operation 


Following are some other common problems that 
TextBridge users have encountered: 


Invalid index for language 


If your PC has 4Mb of physical RAM (random 
access memory), and you get this message while 
using TextBridge, you are probably running DOS 
Version 6 with Memmaker, or have a number of 
devices being loaded into high memory. 


If you are using Memmaker, you will have to undo 
it, as TextBridge requires the space in upper 
memory that Memmaker allocates to files. Close 
out of Windows, back up your autoexec.bat and 
config.sys files, then, at the DOS prompt, type: 


Memmaker /undo 


If you are not using Memmaker, and you get this 
error, it may be because you are using too many 
LH (load high) statements in your autoexec.bat 
file, or DH (device high) statements in your 
config.sys file. 


Back up these files, then try reducing the number 
of LH and/or DH statements in these files. Restart 
your system and try using TextBridge again. 


If you still get the error message while running 
TextBridge, it could be a problem with virtual 
memory. Refer to the previous section, “Virtual 
memory problems,” for information. 




Errors 667, 673, 675, or 690 


If you get any of these errors when reading a TIFF
file, the TIFF file cannot be processed. TextBridge
can process binary (black and white) TIFF files of
the following resolutions and formats:


Resolutions


100-by-200
200-by-100
200-by-200
300-by-300
400-by-200
400-by-400


Formats


TIFF Uncompressed (Intel header)
TIFF CCITT-3 (Intel header)
TIFF CCITT-4 (Intel header)
TIFF Uncompressed (Motorola header)
TIFF CCITT-3 (Motorola header)
TIFF CCITT-4 (Motorola header)
TIFF (Intel FAXability header)


In addition, the TIFF image must contain black 
type on a white background. Some fax programs 
save images in reverse (white type on a black 
background). TextBridge cannot recognize such 
files. Try processing these files using the Preview 
option in TextBridge. If a page image appears in 
reverse, use the Invert command under the View 
menu to correct the image. Then try processing 
the page again. 
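

As a first check on a file that produces one of these errors, the
Python sketch below classifies the first four bytes of a file as an
Intel or Motorola TIFF header. A TIFF file begins with the bytes II
(Intel, low byte first) or MM (Motorola, high byte first), followed by
the number 42 in that byte order. The sketch checks nothing else: it
does not examine compression, resolution, or whether the image is
binary, and it does not identify the FAXability variant.

```python
import struct


def tiff_byte_order(header):
    """Return 'Intel', 'Motorola', or None for the first 4 bytes of a file."""
    if len(header) < 4:
        return None
    if header[:2] == b"II":
        fmt, name = "<H", "Intel"      # little-endian
    elif header[:2] == b"MM":
        fmt, name = ">H", "Motorola"   # big-endian
    else:
        return None
    (magic,) = struct.unpack(fmt, header[2:4])
    return name if magic == 42 else None


print(tiff_byte_order(b"II*\x00"))  # Intel
print(tiff_byte_order(b"MM\x00*"))  # Motorola
print(tiff_byte_order(b"GIF8"))     # None
```

To test a real file, read its first four bytes in binary mode and pass
them to the function. A result of None suggests that the file is not a
TIFF file at all, whatever its extension.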


CORRECTING ERROR CONDITIONS 


Note 


Occasionally, during TextBridge operation, you may 
receive an error message. TextBridge error messages 
are designed to be self-explanatory. Usually, you can 
simply correct the situation and proceed. 


However, if you require more detail about how to 
correct an error condition, consult this section. Each 
error message is listed here, along with a description 
of the cause and a recommended course of action. 


If you encounter an error message not described in
this section, and you cannot resolve the problem on
your own, contact Customer Support.


Canceling. Page too complex for selected mode.


You are trying to process a page that is complex
(for example, one containing halftones) or
improperly oriented, without having set the
appropriate TextBridge preferences.


In Preferences, specify Auto Page Segmentation 
and/or Auto Page Orientation, and try again. 


If you still encounter problems with a particular 
document, contact Customer Support. 


Cannot find file filename (or one of its 
components). Check to ensure the path and 
filename are correct and that all required 
libraries are available. 


You are trying to launch TextBridge, or open one 
of the files in its program group, and the program 
or file represented by filename cannot be found. 


Re-install TextBridge from the original 
installation diskettes. Refer to Chapter 2 of this 
manual for information. 


Cannot find this file. Please verify that the 
correct path and filename are given. 


This indicates that a file that appears in the Open 
dialog box was recently deleted, while the Open 
dialog file listing itself was not updated. This 
could happen if, for example, the file you were 
trying to access was on a network and another 
network user deleted or moved it. 


Try clicking GO! again to access the Open dialog.
The file should no longer be listed. If it is listed,
and you select it again, and you still get this
message, your disk may be corrupted, or you may
have network problems.


Can’t Initialize Server.


You have started TextBridge, and the recognition 
server, named ICRSRVR.EXE, in the TextBridge 
BIN directory is corrupted or missing. 


Re-install TextBridge from the original 
installation diskettes. Refer to Chapter 2 of this 
manual for information. 


Cannot open Help file.


The file named TXBRIDGE.HLP has been removed 
from the TextBridge BIN directory or is damaged. 


Re-install TextBridge from the original 
installation diskettes. 


File Error Cannot find filename 


You are trying to launch TextBridge, or open one 
of the files in its program group, and the program 
or file represented by filename cannot be found. 


Re-install TextBridge from the original 
installation diskettes. Refer to Chapter 2 of this 
manual for information. 


Invalid input format 


You have directed TextBridge to open a file that
has the .TIF extension but is not a valid TIFF
file. Although TIFF is an industry standard, some
applications write non-standard variations of the
TIFF format. TextBridge can read the following
TIFF variations:


TIFF Uncompressed (Intel header)
TIFF CCITT-3 (Intel header)
TIFF CCITT-4 (Intel header)
TIFF Uncompressed (Motorola header)
TIFF CCITT-3 (Motorola header)
TIFF CCITT-4 (Motorola header)
TIFF (Intel FAXability header)


Must select a different language


In Preferences, you have selected a recognition 
language that is not loaded on your system. 


Run the TextBridge SETUP program to re-install 
TextBridge, making sure to include all the 
recognition languages you intend to use. Refer to 
Chapter 2 of this manual for information. 


If you did install the language pack, and 
experience this problem, the language pack file in 
the TextBridge BIN directory has been deleted 
inadvertently, renamed, or corrupted. 


Re-install TextBridge from the original 
installation diskettes. Refer to Chapter 2 of this 
manual for information. 


Parameter combination not supported 


This error is usually the result of trying to have
TextBridge process a TIFF file that is inappropriate
for OCR. For example, the TIFF file could be
extremely low resolution (lower than 100-by-200
dpi), could be color or grayscale, or both.


TextBridge can only process binary (black and
white) TIFF images with resolutions of 100-by-200
dpi or greater.
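

The Python sketch below shows one way to screen an image's parameters
before sending it to TextBridge. It treats the six resolutions listed
in this appendix as the complete set, which is an assumption: the
manual does not say whether other resolutions in between are accepted.
The function name and messages are illustrative only.

```python
# Resolutions (horizontal dpi, vertical dpi) listed in this appendix.
LISTED_DPI = {
    (100, 200), (200, 100), (200, 200),
    (300, 300), (400, 200), (400, 400),
}


def image_problems(x_dpi, y_dpi, bits_per_pixel):
    """Return reasons the image may be rejected; an empty list means none found."""
    problems = []
    if bits_per_pixel != 1:
        problems.append("image is not binary (1 bit per pixel)")
    if (x_dpi, y_dpi) not in LISTED_DPI:
        problems.append(f"{x_dpi}-by-{y_dpi} dpi is not a listed resolution")
    return problems


print(image_problems(300, 300, 1))  # []
print(image_problems(75, 75, 8))    # two problems reported
```

An image that fails either test should be rescanned or converted to a
binary image at a listed resolution in your imaging application.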


Scanner not operational 


A number of conditions can cause this problem. 


Make sure you have followed the manufacturer's 
recommended instructions for installing the 
scanner on your PC, including installing the 
system-level driver. 


Make sure you have followed all the scanner
installation steps described in Chapter 2 of this
manual.


If you still get this message, your scanner may be
powered off. Turn it on and try again.

This can also happen if TextBridge is running and 
your scanner is powered off and back on, or 
otherwise loses power for a brief moment. Try 
exiting from TextBridge and starting it again. 


Otherwise, your scanner may be improperly
connected. Power down your PC and check all
connections. Then try again.


Server not found or down. 


You have started up TextBridge, and the 
recognition server, named ICRSRVR.EXE, in the 
TextBridge BIN directory is corrupted or missing. 


Re-install TextBridge from the original 
installation diskettes. Refer to Chapter 2 of this 
manual for information. 




GLOSSARY OF TERMS


This glossary defines terms and concepts used in this 
manual. For readers who are new to scanning and 
character recognition concepts, this glossary may be 
useful not only as a reference, but as a primer on 
optical character recognition technology, as well. 


Some definitions provided here contain terms in bold 
letters. This means that these terms are also defined 
elsewhere in the glossary. 


application—A software program that enables users
to perform a task or set of tasks. Sometimes also
refers to the use (that is, the “application”) of a
software program.


ASCII—American Standard Code for Information
Interchange. ASCII contains codes for 128 control 
characters, alphanumerics, and symbols. A number of 
so-called extended ASCII sets exist that generally 
allocate another 128 codes for accented characters and 
additional symbols not included in the first 128. 


auto page orientation—In TextBridge, a capability 
to correct for the rotation of the page image before 
recognition begins. For example, if the user were 
scanning a document with mixed pages (for example, 
most pages portrait, some landscape pages with large 
tables), TextBridge could perform auto-orientation on 
each page before beginning recognition. 


auto page segmentation—In TextBridge, a capability
to discern the layout of the page image, and to
recognize and output text in the correct order. For
example, in a newsletter, in which columns are often
of uneven depths and widths, TextBridge recognizes
the layout of the page and outputs text in the correct
sequence. In TextBridge Preferences, you can specify
auto page segmentation on or off.


base name—The portion of the document name
used to identify related page image (TIFF) file names
created when you use the Save Page Images capability
of TextBridge. When you type a document name in the
Save Page Images As dialog, the first four characters
of the name are used as the base name for the page
image files.


brightness—See scanner brightness. 


CCITT—Acronym (from the French name) for the
International Telegraph and Telephone Consultative
Committee, an international standards organization
that has created, among other things, compression
standards for digital data. TIFF files stored in
CCITT Group 3 and Group 4 compression standards can
be recognized by TextBridge.


conversion—A software module that takes text in
one format (the input format) and processes it to 
another format (the output format). In TextBridge, 
recognized text in its internal format can be converted 
to WordPerfect, for example (or any of a number of 
other supported formats). 


DEVICE statement—In the config.sys file, a line
that identifies, for example, a scanner’s device driver 
to applications that may need to run the scanner. 


dialog box—In Microsoft Windows, a category of user 
interface screen that requests interactivity (“dialog”) 
with a user of the application. TextBridge displays a 
main dialog from which you can define and initiate 
OCR jobs. 


document name—The file name you enter in either
the Save Page Images As or Save As dialogs in
TextBridge. The document name is automatically
appended with a three-letter extension that indicates
the format in which the recognized file is saved.


drop-out color—A color, or range of colors, that a
scanner has problems detecting on the page it is 
scanning. This is typically a product of the scanner’s 
own light source. A yellow light source in the scanner, 
for example, will have problems detecting colors in the 
range of yellow to light green. 


Dynamic Data Exchange (DDE)—In the Microsoft 
Windows environment, a standard for sharing data 
among Windows applications. For example, with the 
proper macro in place, a word processor such as 
Microsoft Word for Windows can direct TextBridge to 
scan and recognize text, and import the text to an 
open Word document, from within its own menu 
system. 


edit box—A Windows interface convention shown as 
a rectangular field in a dialog or other area of the 
interface into which a user can type text or in which 
existing text can be edited. In Windows, an edit box 
supports standard ways in which the user can edit 
information in the box. For example, in a typical Save 
As dialog in a Windows application, the area in which 
the output file name is typed is a standard edit box. 


Enhanced mode—The most advanced of the three 
modes in which Microsoft Windows will run. The 
other two modes are Real and Standard. TextBridge 
runs only in Enhanced mode. 


Expanded memory driver—A program that makes
part of extended memory appear as additional
expanded memory so that programs that require more
than the 360K of expanded memory typically
available on a DOS machine can run.


fax image—The representation of a page in the form
of binary data (usually 200x100 or 200x200 dpi 
resolution) transmitted by a facsimile (fax) machine or 
fax modem card. Computers with fax modem cards 
can receive a fax image and store it on-line as a TIFF 
file. TextBridge can open and recognize the text from 
an on-line fax image stored in TIFF format. 


fax modem—An external device or printed circuit
board that plugs into a PC enabling the receipt and 
transmission of digital image data across a 
telecommunications (phone) line. With a fax modem 
connecting your PC to a phone line, you can receive 
and transmit document images to and from your PC. 


galley format—The single-column format in which 
TextBridge outputs text recognized from multiple- 
column documents. 


halftone—An image composed of differently-sized
black dots spaced in such a way as to simulate the 
different gray tones of an original photograph or color 
drawing. In the Preferences dialog, you can specify 
that TextBridge is to ignore halftones when 
performing optical character recognition. 


handles—Solid square objects typically in the four
corners of a rectangle in a drawing package or other
application which enable resizing of the rectangle. In
TextBridge, you can draw a rectangular zone on a
previewed page image to define the area of the page to
be recognized. Handles on the zone enable you to
resize the zone.


hypertext—A capability to traverse a textual data
base (an on-line Help system, for example) in a 
number of different ways: by selecting a subject in an 
index; by stepping sequentially forward and 
backward; by keyword search; by context (that is, 
clicking on a word in text to get its definition or 
another screen of information about it). TextBridge 
uses the Microsoft Windows Help engine to navigate 
its built-in hypertext-based Help system. 


input source—The origin of page images being 
recognized: either the scanner or a TIFF file. 


ISIS—Acronym for Image and Scanner Interface
Standard developed by Pixel Translations, Inc. ISIS is
an application programming interface (API) for the
design and development of scanner drivers. Pixel
Translations and other scanner vendors use ISIS to 
develop scanner drivers. TextBridge supports most of 
the ISIS-compatible scanners available on the market 
today. See also TWAIN. 


language pack—A component of TextBridge that 
enables the application to perform OCR on a 
document composed in a particular language. In the 
TextBridge Preferences dialog, you can specify that 
the document to be recognized is in one of English, 
French, Italian, German, or Spanish. TextBridge 
loads the appropriate language pack before beginning 
recognition. See also recognition language. 


native user interface—In the TWAIN specification,
the set of screen displays and keyboard controls that a
TWAIN source driver provides to programs supporting
the TWAIN device. For example, TextBridge runs
with scanners that have fully TWAIN-compliant
source drivers. However, the controls for that scanner
are provided in the native UI, not in TextBridge.


noise—An errant mark on a page that can be
recognized as one or more characters during OCR.


optical character recognition (OCR)—A
technology in which binary images of character shapes 
are analyzed and identified as particular characters 
and output to a text data stream, either in computer 
memory or to a computer file. 


OCR printer—A TextBridge application designed to 
work like a printer from virtually any Windows-based 
fax or imaging program. In the host program, you 
display a fax or other image containing text and use 
the Print command to send it to the OCR printer. The 
OCR printer performs TextBridge OCR, then displays 
the Save As dialog to allow you to save the text to a 
file in the desired text format. 


OS/2—A graphical operating system designed by 
International Business Machines (IBM®) Corporation 
to run on Intel-based personal computers. OS/2 is a 
true multi-processing operating system which can run 
native programs, as well as programs designed for 
Microsoft Windows and DOS. 


output text format—See text format.


page image—A binary (black and white) picture of a
page stored in computer memory or on disk. Page
images are scanned or read from a TIFF file and sent
to TextBridge for optical character recognition (OCR).


permanent virtual memory—See virtual 
memory. 


pixel—Short for “picture element,” one of many dots 
that make up a digital image. 


preferences— In TextBridge, the settings that you 
can specify to control the OCR process. 


Preview—In TextBridge, a capability that enables 
you to view, zoom, and zone a page before processing. 


questionable word—A word that falls below a confidence 
threshold built into TextBridge. During OCR, 
TextBridge assigns a confidence value to each word. If 
the value falls below the confidence threshold, and 
you are using the Verifier, TextBridge displays the 
word as questionable. 
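
The thresholding described above can be sketched as follows. This is only an illustration: the 0.0 to 1.0 confidence scale, the threshold value of 0.80, and the names `RecognizedWord` and `questionable_words` are assumptions made for the example, not TextBridge's actual built-in values or interfaces.

```python
from dataclasses import dataclass

# Hypothetical threshold on an assumed 0.0-1.0 scale; the value built into
# TextBridge is not stated in this manual.
CONFIDENCE_THRESHOLD = 0.80


@dataclass
class RecognizedWord:
    text: str          # the word as recognized
    confidence: float  # confidence value assigned during OCR (assumed scale)


def questionable_words(words, threshold=CONFIDENCE_THRESHOLD):
    """Return the words whose confidence value falls below the threshold."""
    return [w for w in words if w.confidence < threshold]


page = [
    RecognizedWord("optical", 0.97),
    RecognizedWord("charaeter", 0.42),   # likely misread, low confidence
    RecognizedWord("recognition", 0.91),
]
flagged = [w.text for w in questionable_words(page)]
print(flagged)
```

In this sketch only the low-confidence word would be presented for review, which mirrors how the Verifier shows questionable words one at a time.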


RAM disk—A part of your computer’s extended 
memory set up to behave like a hard disk for 
temporary file storage. 


recognition—The TextBridge process during which a 
page image (scanned or on-line TIFF) is analyzed, and 
characters and words are identified and saved as a 
text data stream in memory or in an on-line 
temporary file. Note that the TextBridge recognition 
engine not only performs recognition (OCR), but also 
performs segmentation, orientation (rotation), format 
analysis, and retention of text styles (bold, italic). 


recognition language—The primary language (for 
example, English, French) in which a document is 
composed. In TextBridge, you can specify that the 
document is in one of a number of different languages. 
TextBridge loads the appropriate language pack 
before beginning OCR. 


region—A logical block of type on a page image. 
TextBridge, through its auto-segmentation 
capability, locates regions of text on a multi-column 
document and outputs them in the correct order. 


resolution—The degree of detail, measured in dots 
per inch (dpi), with which a scanner or fax machine 
can input an image. TextBridge can perform OCR 
(optical character recognition) on page images in any 
of the following resolutions (dots per inch): 400x400, 
400x200, 300x300, 200x200, 200x100, and 100x200. 
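
As a rough illustration, the list above can be treated as a lookup table. The sketch assumes that each pair is written as horizontal by vertical dots per inch, which the manual does not state, and `is_supported` is a hypothetical helper name.

```python
# Resolutions listed in the glossary entry, read as (horizontal dpi, vertical dpi).
# The horizontal-by-vertical reading is an assumption.
SUPPORTED_RESOLUTIONS = {
    (400, 400),
    (400, 200),
    (300, 300),
    (200, 200),
    (200, 100),
    (100, 200),
}


def is_supported(x_dpi, y_dpi):
    """Return True if the resolution pair appears in the supported list."""
    return (x_dpi, y_dpi) in SUPPORTED_RESOLUTIONS


print(is_supported(300, 300))
print(is_supported(150, 150))
```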


RTF—Rich Text Format, a text format developed by 
Microsoft Corporation, with embedded codes that 
describe fonts, formatting, and so on. 


scanner brightness—A setting to determine the 
intensity of light the scanner projects on the page 
being scanned in order to lighten or darken the 
resulting image. Adjusting brightness can often 
improve the accuracy of recognition (for 
example, brightening a page whose characters are 
tightly spaced can improve recognition). In 
TextBridge, you can specify scanner brightness in the 
Scanner Settings dialog. 


scanner driver—A program that is written as an 
interface between a software application and a 
scanner. The scanner driver sends requests from the 
application to the scanner in a language the scanner 
can understand. 


Scanner Setup—Part of the TextBridge OCR 
program group in Windows, this program is designed 
to enable you to load the correct high-level driver so 
TextBridge can run with your scanner. See also ISIS 
and TWAIN. 


text format—The word processor, spreadsheet, or 
other file format to which recognized text can be 
converted and output. TextBridge supports output of 
recognized text to these formats: 


Ami Pro (2.0, 3.0)                   Multimate Advantage 
ASCII (Standard, Smart, Stripped)    PostScript 
dBase IV                             Prof Write (2.0, 2.2) 
DCA/RFT                              RTF (Microsoft’s Rich Text Format) 
DisplayWrite 5                       Samna Word IV 
Excel (Mac, 3.0, 4.0)                Windows Write 
FrameMaker                           Word for Windows 2.0 
Interleaf                            WordPerfect (4.2, 5.1) 
Lotus 1-2-3                          WordStar 


TIFF image—A binary representation of a page or 
graphic stored in Tag Image File Format, an industry-standard 
image file format. TextBridge can recognize 
text from page images stored in several variations of 
TIFF, as follows: 


TIFF Uncompressed (Intel header) 
TIFF CCITT-3 (Intel header) 
TIFF CCITT-4 (Intel header) 


TIFF Uncompressed (Motorola header) 
TIFF CCITT-3 (Motorola header) 
TIFF CCITT-4 (Motorola header) 


TIFF (Intel FAXability™ header) 


When you select Save Page Image in the Main 
dialog, TextBridge automatically saves scanned page 
images to files in TIFF CCITT-3 (Intel header) format. 
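
The Intel and Motorola header variants listed above differ in byte order, which a standard TIFF file records in its first four bytes: "II" (Intel, little-endian) or "MM" (Motorola, big-endian), followed by the number 42 in that byte order. The following sketch shows how a program might tell the two apart. The function name `tiff_byte_order` is hypothetical, and the sketch does not attempt to recognize the FAXability header variant.

```python
import struct


def tiff_byte_order(header: bytes) -> str:
    """Classify a TIFF header as 'Intel' or 'Motorola' from its first 4 bytes."""
    if len(header) < 4:
        raise ValueError("need at least 4 header bytes")
    mark = header[:2]
    if mark == b"II":        # little-endian, Intel byte order
        fmt, name = "<H", "Intel"
    elif mark == b"MM":      # big-endian, Motorola byte order
        fmt, name = ">H", "Motorola"
    else:
        raise ValueError("not a TIFF header")
    (magic,) = struct.unpack(fmt, header[2:4])
    if magic != 42:          # TIFF magic number, stored in the file's byte order
        raise ValueError("not a TIFF header")
    return name


print(tiff_byte_order(b"II*\x00"))
print(tiff_byte_order(b"MM\x00*"))
```

To check a file on disk, read its first four bytes in binary mode and pass them to the function.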


TSR program—A program designed to automatically 
load into memory when you start your system, or to 
stay in memory even after you exit it. 


TWAIN—An image and scanner interface standard, 
complete with an API, for the development of 
interfaces to imaging devices (scanners, fax machines, 
and so on). TextBridge supports any fully TWAIN- 
compliant scanner or other device that connects to a 
PC and produces binary (black-and-white) images in a 
supported size and resolution. 


Verifier—A capability that enables you to view and, 
if necessary, correct TextBridge recognition decisions 
word by word. A Verifier window similar to the 
Preview window shows you the recognized word and 
its associated image on the scanned page. The 
recognized word is highlighted in an edit box, allowing 
you to type corrections if necessary. 


virtual memory—Space on your hard disk set up to 
simulate random access memory (RAM) on your PC. 
TextBridge, especially on systems with only 4 MB of 
RAM, must be configured with permanent virtual 
memory, a contiguous group of storage blocks on your 
hard disk. 


Windows—A graphical user interface (GUI) and a 
host of related modules developed by Microsoft 
Corporation for use on DOS-driven personal 
computers. TextBridge runs with Windows, version 
3.1 and later. 


word verifier—See Verifier. 


working directory—In Windows, when an 
application is installed, or at any time thereafter, you 
can designate a directory anywhere in your DOS file 
system as the working directory. You do this in 
Program Manager by selecting the Properties 
command from the File menu. For TextBridge, the 
working directory is a BIN subdirectory in the 
installation directory you specify at installation time. 
By default, the TextBridge installation program uses 
C:\TXBRIDGE\BIN as the working directory, 
although you can select any directory in any partition 
on your DOS file system. 
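
The relationship between the installation directory and the working directory can be sketched as a simple path computation. The helper name `working_directory` is hypothetical, and the default installation directory of C:\TXBRIDGE is inferred from the default working directory given above.

```python
from pathlib import PureWindowsPath


def working_directory(install_dir=r"C:\TXBRIDGE"):
    """Return the BIN subdirectory of the given installation directory."""
    return PureWindowsPath(install_dir) / "BIN"


print(working_directory())           # default installation directory
print(working_directory(r"D:\OCR"))  # a directory chosen at installation time
```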


zone—In the TextBridge Preview window, a 
rectangular border that you can draw around a 
portion of the displayed page image to define the area 
of the page to be processed. 


zoom—In the TextBridge Preview window, the 
capability to magnify (“zoom in”) a page image to full 
resolution and back (“zoom out”) to a resolution that 
enables the entire page image to be viewed. 


INDEX 


A A4, 3-4
AccuPage, 3-4
Add More Pages dialog, 3-6
All Pages command, 3-17
ASCII, 4-21
Auto Page Segmentation with Preview, 4-14
Auto Page Segmentation, 3-3, 4-14
Auto-brightness, 4-4
Automatic document feeder (ADF), 3-5


B Brightness, 3-4
    adjusting for best character recognition, 4-3
    auto, 4-4
    use preview window to evaluate, 4-4
    when to increase or decrease, 4-3
Bulletin board services, xii


C CCITT, 3-5, 4-15
Character recognition
    compensating for colors, 4-5
    getting the best, 4-1
Color on pages to recognize, 4-5
Columns, 3-3, 4-14
Confidence threshold, 3-19
CONFIG.SYS file, 2-5, 2-18
Customer support, xi, A-2
    bulletin board services, xii


Datacopy 111 card, 2-5
Datacopy scanners, 2-1, 2-5
Datacopy 730GS scanner, 2-5
DISCOVER software, 2-5
Document Quality, 3-3, 4-4
Documentation conventions, ix
Documents with pictures, 3-4
DOS, 1-2, 1-8, 2-1
Drop-out color, 4-5


End Verification command, 3-23
Error messages, A-1, A-12 to A-16 


Fax document quality setting, 4-6
Fax documents, 1-4
Fax image
    recognition, 3-8, 4-5
    synthesized, 4-6
Fax modem, 2-6
    files, 1-3
    recognizing images from, 3-8
Formats, text, supported by TextBridge, 1-4


Galley format, 3-3 
Glossary of terms, G-1
GSplus scanner, 2-5 


Halftones, helping TextBridge ignore them, 4-11
Hand-held scanners, 1-7
Help system in TextBridge, 1-9
HP AccuPage, 3-4


I Ignore Photos/Halftones filter, 3-4
    using to improve processing speed, 4-11
Images
    that TextBridge can read, 4-25
Initialization (.ini) files, 2-17
Invert command in Preview, 4-9
ISIS, 1-6, 2-4, 3-6, 3-14, 3-20
    configuring a scanner, 2-15
    installing and testing a driver, 2-12
    testing the installation, 2-16
    troubleshooting scanner problems, A-3


L Language packs, 1-7, 2-1
    installing, 2-8
Language, 3-3
Legal, 3-4
Letter, 3-4


M Main dialog, 1-1
Memory requirements, 1-8, 2-2
Microprocessor needed to run TextBridge, 1-2, 1-8
Microsoft Windows, vii, x
Multi-column documents, 3-3
    processing, 4-14


N Noise, 3-22, 4-3 


O OCR, vii, 1-1, 3-1
    what it stands for, 1-1
    compensating for colors, 4-5
    using the Verifier to improve accuracy, 4-8
Open dialog, 3-10




Optical character recognition. See OCR
Orientation, 3-3
    auto, 4-12
OS/2, 1-2
    running TextBridge under, 1-8
Output formats, 1-4


Page images, on-line, 1-2
Page orientation, 3-3
    Auto setting, 4-12
Page size, 3-4
Preferences, 3-2
    Auto page orientation, 4-12
    Auto Page Segmentation, 4-14
    Ignore Photos/Halftones option, 4-12
    scanner settings, 4-4
    table of, 3-3
Preview, 1-8, 3-1, 3-12
    All Pages command, 3-17
    auto page segmentation with, 4-14
    Invert command in, 4-9
    This Page command, 3-17
    toolbox, 3-13
    zone limit, 3-13
    zoning in, 4-10
    zooming in, 3-15
Preview window, 3-14
    use to evaluate document quality, 4-4
Previewing pages before processing, 3-12
Publications, related, x


Questionable word, 3-18, 3-21


Rancho Technology 1201 card, 2-5
Recognition language, 3-3
Release Notes, x, 1-4
Resolution, 3-4
RTF, 4-21


Save As dialog, 3-5, 3-7, 4-26
Save Page Images, 4-15
Scanner
    Datacopy models, 2-1, 2-4, 2-5
    drivers, x
    drop-out color, 4-5
    installing and testing, 2-4
    system-level driver, 2-1, 2-6
    use and maintain properly, 4-2
Scanner brightness, 3-4
Scanner installation steps, 2-4
Scanner resolution, 3-4
Scanner Settings, 3-4, 4-4
Scanner setup program, 2-1, 2-9
Scanners
    supported by TextBridge, 1-6, 2-4
    hand-held, 1-7
Scanning and converting a document, 3-5
Select Source dialog, 2-11
730GS scanner, 2-5
Show Special Characters command, 3-21
Software registration card, 1-7
Swap file, 1-8
System requirements, 1-8




T Technical support, xi
Text format, 1-2, 1-4, 3-1
    specifying for recognized text, 3-7, 4-26
TextBridge
    bulletin board services, xii
    customer support for, xi, A-2
    de-installation, 2-16
    default installation directory, 2-8
    description of, 1-1, 1-2
    disk space requirements, 1-8
    DOS versions supported, 1-2
    embedded in other applications, 1-1, 4-17
    error messages, A-1, A-12 to A-16
    fax filter for improved OCR of fax images, 4-5
    getting the best character recognition, 4-1
    how it works, 1-2
    ignore halftones, 4-11
    initialization (.ini) files, 2-17, A-8
    installation, 2-1
    installing and testing a TWAIN driver, 2-10
    installing and testing an ISIS driver, 2-12
    inverted documents, 4-9
    ISIS scanner problems, A-3
    language packs, 1-7, 2-1, 2-8
    main dialog, 1-1
    memory requirements, 2-2
    microprocessor needed to run, 1-2, 1-8, 2-2
    on-line help, 1-9
    OS/2 support, 1-2, 1-8
    owner registration, vii, 1-7
    pointsize range, 1-4
    preferences, 3-2, 3-3
    preview capability, 1-3, 3-12
    processing multiple pages, 4-6
    processing non-TIFF images, 4-22
    program group in Windows, x, 2-9, 4-18
    recognition languages, 1-4, 2-8
    related publications, x
    release notes, x, 1-4
    requirements to run, 1-8
    running OCR from within other applications, 4-17, 4-20
    running with Datacopy scanners, 2-5
    Save As dialog, 3-7, 4-26
    scan and convert a document, 3-5
    scanner drivers disk, 2-14
    scanner installation steps, 2-4
    scanner setup program, 2-1, 2-9
    scanners supported, 1-6, 2-4
    SETUP program, 2-7, 4-18
    software installation, 2-6
    supported image resolutions, 4-25
    system optimization, 2-2
    text format specification for recognized text, 3-7
    text formats supported, 1-4
    TIFF file recognition, 3-10
    tips and techniques for using, 4-1
    tips for efficient processing, 4-10
    troubleshooting, A-1
    TWAIN scanner problems, A-5
    types of documents it can OCR, 1-4
    user’s guide organization, viii
    using with a TWAIN scanner, 3-6, 3-14, 3-20
    using with an ISIS scanner, 3-6, 3-14, 3-20
    verify text during OCR, 1-3, 3-18
    virtual memory, setting up, A-9